Archive for the ‘OpenSolaris’ Category

iPXE: Now with Native Menus and SmartOS Support

Monday, October 8th, 2012

If you’ve never heard of iPXE, it is the official fork of gPXE, which was the ultimate result of the Etherboot Project of old.  Apparently there was a power struggle that caused the primary contributors to leave Etherboot/gPXE, and they renamed their fork of gPXE to iPXE to distinguish it.  Technically gPXE still exists, but for all intents and purposes it’s a dead project.

If you are completely unfamiliar with both iPXE and gPXE, let me summarize.  The industry standard way to network boot is via PXE.  A PXE client is burned into the ROM of your NIC, but because it has to fit in a tight space it is very dumb.  iPXE is an open source PXE client that is modern and very intelligent.  It can execute scripts; it can inspect the system interfaces and SMBIOS; it can download images and scripts via HTTP, FTP, NFS, and more; and it has SAN support for booting off of AoE, FCoE, and iSCSI.  It can be used in several ways: burned into your NIC’s ROM as a replacement (uncommon), booted from USB/ISO/etc. media, or, most typically, PXE booted itself, such that the dumb PXE client in your NIC boots iPXE, which then does all the heavy lifting.  If you are doing any type of network booting you should know what iPXE is, and if you ever want to do anything fancy, iPXE is the way to do it.  One example many of us like to use is an iPXE script that passes information from SMBIOS and the NIC (serial number, service tag, MAC address, etc.) to a web app (commonly PHP), which consults a database to decide which image to boot.  You can do lots of fun things.  Most of your next-gen bare metal provisioning tools, such as Razor, rely on iPXE.
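To make that last example concrete, here is a minimal sketch of the server side in Python rather than PHP. Everything in it is an assumption for illustration: the INVENTORY table stands in for a real database, the serial number is made up, and the kernel/initrd paths are placeholders. On the client side, iPXE would fetch it with something along the lines of “chain http://boot.example.com/boot?serial=${serial}”, where the host name is hypothetical.

```python
# Hypothetical inventory standing in for a real database:
# maps a machine's serial number to the image it should boot.
INVENTORY = {
    "ABC1234": "smartos",
}


def ipxe_script_for(serial, default="shell"):
    """Return the text of an iPXE script for the machine with this serial."""
    image = INVENTORY.get(serial, default)
    if image == "smartos":
        # Placeholder paths; substitute wherever your platform image lives.
        return "\n".join([
            "#!ipxe",
            "kernel /smartos/platform/i86pc/kernel/amd64/unix",
            "initrd /smartos/platform/i86pc/amd64/boot_archive",
            "boot",
            "",
        ])
    # Unknown machines just drop to the iPXE shell.
    return "#!ipxe\nshell\n"
```

A web handler would call ipxe_script_for() with the serial from the query string and return the result as plain text.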

There are two really exciting things for me, both added in the last couple of months.  The first and most basic is that SmartOS boots natively from iPXE.  In the past, primarily with OpenSolaris, you had to chainload PXEGRUB to boot Solaris, but it looks like some patches were accepted and now you can dump GRUB completely.

The other exciting development is the addition of native menus in iPXE.  Historically, if you wanted to create a versatile netboot server you would use iPXE/gPXE to chainload SYSLINUX’s menu.c32 program, which would render your boot selection menu and boot your selected OS.  But no more!  iPXE can do it all on its own now thanks to the addition of 3 commands: menu, item, and choose.  With these new commands and liberal use of “goto” labels you can create some extremely complex and powerful setups with no other helper programs in the way.

Let’s take a look at a simple menu:

#!ipxe

######## MAIN MENU ###################
:start
menu Welcome to iPXE's Boot Menu
item
item smartos    Boot SmartOS
item
item shell      Enter iPXE shell
item reboot     Reboot
item
item exit       Exit (boot local disk)
choose --default smartos --timeout 60000 target && goto ${target}

## Utility menu items:
:shell
echo Type exit to get back to the menu
shell
set menu-timeout 0
goto start

:failed
echo Booting failed, dropping to shell
goto shell

:reboot
reboot

:exit
exit

########## MENU ITEMS #######################
:sdc
kernel /sdc/20121001T165806Z/platform/i86pc/kernel/amd64/unix -B hostname=r720test,standalone=true
initrd /sdc/20121001T165806Z/platform/i86pc/amd64/boot_archive
boot

:smartos
kernel /smartos/20121004T212912Z/platform/i86pc/kernel/amd64/unix
initrd /smartos/20121004T212912Z/platform/i86pc/amd64/boot_archive
boot

You can see here that the “menu” command declares a menu with a title. The elements are items with a label and description (you can assign hot keys as well). An item with no value is an empty line, and you can use the “--gap --” argument to create section headers, in the form “item --gap -- -----SmartOS-------”. Finally, the choose command puts your selection into a named variable and also allows you to specify a default selection and a timeout in milliseconds. Just about everything else is handled by the “goto” command and labels sprinkled throughout the script. Most importantly, we use the value obtained by the choose command to “goto” the label with the commands to boot the given OS. You can also have multiple menus, one which goes to the other and back, by being creative.
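If you would rather generate menus than hand-write them, the three commands are regular enough to template. The following is a minimal Python sketch, not anything iPXE ships; the function name render_menu and its tuple convention are my own assumptions. It emits only the menu portion, so the labels it jumps to still have to be defined elsewhere in the script.

```python
def render_menu(title, entries, default, timeout_ms):
    """Build the menu portion of an iPXE script.

    entries is a list of (label, text) tuples:
      (None, None)    -> blank line
      (None, "text")  -> section header (item --gap --)
      ("label", "x")  -> selectable item that jumps to :label
    """
    lines = ["#!ipxe", ":start", "menu " + title]
    for label, text in entries:
        if label is None and text is None:
            lines.append("item")
        elif label is None:
            lines.append("item --gap -- " + text)
        else:
            lines.append("item %s %s" % (label, text))
    # choose stores the selected label in ${target}, then we jump to it.
    lines.append(
        "choose --default %s --timeout %d target && goto ${target}"
        % (default, timeout_ms)
    )
    return "\n".join(lines) + "\n"
```

For example, render_menu("Boot Menu", [(None, "---- SmartOS ----"), ("smartos", "Boot SmartOS")], "smartos", 60000) produces a two-entry menu with a section header and a 60 second timeout.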

When you couple all this together, you get an iPXE that is more powerful than ever before and extremely exciting.

I’ve taken this opportunity to update the SmartOS Documentation for PXE booting; using iPXE directly as above is now the officially recommended way to netboot.

Solaris Family Reunion: TOMORROW!

Monday, October 3rd, 2011

Sorry for the late notice, but all you folks out here in the Bay Area for Oracle OpenWorld won’t want to miss out on a very exciting event tomorrow night:

  • What? Solaris Family Reunion
  • Where? Joyent HQ, 345 California St, 20th Floor
  • When? Tuesday Oct 4th, 6PM till 10PM (and maybe a pub after that!)
  • Why? Beer! Food! Community!
  • Register here: http://smartos-estw.eventbrite.com

We’ve all gone off in different directions, but this will be an amazing and rare opportunity to get the band back together, share stories and talk about the future and just have a good time as a Solaris community.  You will not want to miss it!

Using Graphite to Graph DTrace Metrics

Tuesday, June 21st, 2011

If you haven’t heard of Graphite you are missing out on a serious operations power tool. Let me make a gross oversimplification and slightly inaccurate assertion to get you in the ballpark of understanding what it is: it’s RRDtool reimplemented for the web.

Let me be more specific for those new to it. Graphite is really made up of 3 components. The first is “Carbon”, a metrics collection daemon that receives data over a network socket, caches it, and then records it to disk. The second is “Whisper”, the round robin database that Carbon uses to permanently store your metrics on disk. The third is a Django app which can generate graphs based on your metrics via a snazzy web UI or via a simple URL API. So it implements an RRD database like RRDtool and a means of graphing the data like RRDtool, but it’s accessible via a browser and graphs dynamically, so unlike RRDtool it isn’t necessary to pre-render static graphs at some interval.

There are 3 reasons I really find it hard to ignore Graphite. Firstly, you do not need to pre-generate your databases; if you send it a metric it hasn’t gotten before, it just creates the database based on a flexible schema configuration. Secondly, you can get your graphs essentially in real-time by just refreshing a URL, no pre-generation. Thirdly, you can send it metrics using something as simple as netcat. The result is an insanely flexible metrics graphing system with very little configuration required and no agents necessary.

So let me demonstrate how we can use all this power together with DTrace in a sample script:

#!/bin/bash
# Example DTrace/Graphite Integration
# Ben Rockwood 

export HOSTNAME=`hostname`
export GRAPHITE_SERVER="10.0.0.22";

/usr/sbin/dtrace -n '

#pragma D option destructive
#pragma D option quiet

BEGIN
{
        mycounter = 0;
}

syscall::read:entry
{
        mycounter++;
}

tick-1sec
{
        /* system("echo \"DEBUG: Sending data to metric dtrace.$HOSTNAME.syscall.read.entry
                                    on server $GRAPHITE_SERVER\" "); */
        system("echo \"dtrace.$HOSTNAME.syscall.read.entry %d %d\" | nc $GRAPHITE_SERVER 2003 ",
                     mycounter, walltimestamp / 1000000000);
        mycounter = 0;
}
'

So what I’m doing here is running a DTrace script via BASH. I’m using BASH as a wrapper so that I can do setup such as getting the hostname. The DTrace script itself is overly simplistic: we’re just incrementing a counter on every read system call. The “tick-1sec” probe fires every second, at which point it runs a system command and resets the counter. System commands can be destructive, so you’ll notice that pragma is set.

The system command we’re executing simply echoes the metric in Graphite’s format and pipes it to netcat (“nc”), which sends it to the Graphite server. The format is simple: “some.metric.name value epoch_time”. My metric here will be dtrace.newton.syscall.read.entry. (Newton is my workstation.)
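The same line format is easy to produce from any language, which is handy when the data isn’t coming from DTrace. Here is a minimal Python sketch of what the echo and nc pair does, assuming the Graphite server accepts plaintext metrics over TCP on port 2003 as in the script above. The function names are my own.

```python
import socket
import time


def format_metric(name, value, timestamp=None):
    """Return one metric line: 'some.metric.name value epoch_time\\n'."""
    if timestamp is None:
        timestamp = int(time.time())
    return "%s %s %d\n" % (name, value, timestamp)


def send_metric(server, name, value, port=2003, timestamp=None):
    """Open a TCP connection, send a single metric line, and close it."""
    line = format_metric(name, value, timestamp)
    with socket.create_connection((server, port), timeout=5) as conn:
        conn.sendall(line.encode("ascii"))
```

Calling send_metric("10.0.0.22", "dtrace.newton.syscall.read.entry", 42) would record a value of 42 at the current time.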

I start that running and then go to the following URL:


http://10.0.0.22:8888/render/?width=400&height=250&target=dtrace.newton.syscall.read.entry&from=-1hours

And this is what I see:

See how flexible it is? If I wanted to run this on 4 web servers I could fire up the script, unmodified, on all 4 servers and then simply modify the URL to change the hostname in the target from “newton” to “*” and it would graph all 4 together, without having to even log onto the Graphite server. This is why I love Graphite: it’s so flexible you can pretty much cram it in anywhere and get useful data in a pinch.
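Since the graph is just a URL, building one per host (or one wildcard URL for all of them) is a few lines of code. This is a minimal Python sketch using only the query parameters that appear in the URL above; the function name render_url and its defaults are assumptions.

```python
from urllib.parse import urlencode


def render_url(base, target, width=400, height=250, since="-1hours"):
    """Build a Graphite render URL for one target (wildcards allowed)."""
    query = urlencode(
        [("width", width), ("height", height), ("target", target), ("from", since)],
        safe="*",  # keep the wildcard readable instead of encoding it as %2A
    )
    return "%s/render/?%s" % (base, query)
```

Passing “dtrace.*.syscall.read.entry” as the target gives the all-hosts version of the graph.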

Word of warning: The script above is intentionally oversimplified. My point here is to illustrate the basic principles, nothing more.

Duo Security: Two Factor Auth for the Masses

Thursday, June 9th, 2011

Smart Cards, OTP, Hardware Tokens like SecurID… 2 factor auth is an old standby and considered mandatory for any high security installation.  But let’s face facts, there are a myriad of problems involved.  SecurID is complex and expensive and has now destroyed its credibility following the Lockheed break-in.  Smart Cards are really sweet, especially solutions from ActivIdentity, but again they’re expensive and you have client hardware requirements which can be a problem with many users.  OTP is nifty but most of the solutions out there are ancient and may not work with the platform you’re using.  But… that is the price of security, right?  And what about all these new cloud deployments, traditional 2 factor solutions for your cloud?  Just shoot me.

Today I stumbled across Duo Security and was amazed.  It is an entirely modern 2 factor auth system that uses a SaaS model, open source client software and open APIs, integrates with just about anything, and uses the phone you already have in your pocket.

What’s amazing is that the guys at Duo have nailed the setup.  You go to Duo Security and sign up for an account; before you’ve finished registering they’ve already verified your phone via an automated voice call.  You finish the easy wizard and within 2 minutes you’re looking at their dashboard with a free account that supports up to 10 users.  For a UNIX system you download and compile their software (packages are available for Linux distros), which has a client program as well as a PAM module.  You add a new “Integration” (essentially an auth realm with its own API key), feed the keys into the client configuration (which is only 3 lines long, btw), and run the client, which gives you a URL to finish validating the host, and you’re done.  10-15 minutes after first hitting their website you are up and running 2 factor security without a bit of pain.  It’s so simple it just makes me smile… and how often does anything security related do that?

Duo supplies special variations of the service that are just as easy for Juniper, Cisco and Sonicwall VPNs as well as a Web API… but I’m not going to address those here.

Once your UNIX host is set up, you have some options on how to employ it.  You can use PAM, which will make all users dual auth via Duo, or you can use a nifty per-user SSH trick by adding command="/usr/local/sbin/login_duo" to the beginning of your public key in the .ssh/authorized_keys file (which I didn’t even know was possible).  If you don’t have the ability to modify PAM this SSH hack is a great solution.
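If you have more than a couple of keys to convert, the authorized_keys edit is simple to script. A minimal Python sketch of the line transformation follows; the function name is my own, the key material in the example is a placeholder, and the login_duo path is the one mentioned above.

```python
def duo_authorized_key(public_key, login_duo="/usr/local/sbin/login_duo"):
    """Prefix one authorized_keys entry so that logins with it run login_duo."""
    return 'command="%s" %s' % (login_duo, public_key.strip())
```

For example, duo_authorized_key("ssh-rsa AAAAexamplekey benr@newton") returns the same entry with the command= option in front of it, ready to be written back to .ssh/authorized_keys.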

But what’s really important is the experience of actually using it for auth.  Here is how it works for real using an SSH session.  When you log into your system, after your password or key is accepted as usual, it stops the auth process and asks how to contact you:

Ben-Rockwoods-MacBook-Pro:~ benr$ ssh cuddletech.com
Password:
Duo login for benr

 1. Duo Push to XXX-XXX-1100
 2. Phone call to XXX-XXX-1100
 3. SMS passcodes to XXX-XXX-1100
Passcode or option (1-3): 1

Pushed a login request to your phone...

At this point the SSH session is paused.  Notice you have 3 choices: Duo Push (smartphone app), phone or SMS.  Duo Push is a free app for Android and iPhone which can accept push notifications.  When you do your setup, part of the process will be installing this app if you wish, which only takes 2-3 minutes.  If you choose to use Duo Push, as I did, you’ll see something like this on your phone:

After accepting, your SSH session comes back to life:

Success. Logging you in...

Last login: Wed Jun  8 22:57:20 2011 from xxxxxx
                                __                       __
                       __      / /___  __  _____  ____  / /_
                    __/ /___  / / __ \/ / / / _ \/ __ \/ __/
                   /_  __/ /_/ / /_/ / /_/ /  __/ / / / /_
                    /_/  \____/\____/\__, /\___/_/ /_/\__/
                                    /____/
[cuddletech:~] benr$

It’s that easy!

Duo just got everything spot on: it’s easy, the documentation is clear and concise, it’s just beautiful.  The best part of it all is that it’s free for up to 10 users, which means that if you just have a single web server you wish to secure, you can!  Thanks to the SSH hack above you could even do it on a Shared Hosting account.  There is even a plugin for WordPress to use Duo for WP login.

To get started with it yourself, I recommend this post on the Duo blog: Announcing Duo’s two-factor authentication for Unix.  It walks you quickly through the whole process I described above.

In all fairness, I’ve only been using this for less than a day so I’m sure there are kinks I’ll run into and things to be improved, but it truly is amazing that I’ve got what feels like a solid solution working so quickly.  Auditing and logging get a lot more interesting when you don’t have to second guess whether or not the user is in fact the user you think, and this product opens up a lot of new possibilities and fills a real gap in the world of cloud security.

NOTE FOR OPENSOLARIS/ILLUMOS PAM USERS:

After you download and unpack duo_unix-1.6.tar.gz, run “./configure --enable-pam”.  Before you run “make”, edit config.h and comment out the line “#define HAVE_ASPRINTF 1”.  After that PAM will compile fine.  If you don’t, you’ll get: pam_extra.h:10: error: syntax error before "va_list".  Also, make sure that you have an ‘sshd’ user for Duo to use.
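If you rebuild often and want to script the config.h tweak, this is a minimal Python sketch of the edit. It assumes the define appears on a line by itself exactly as quoted above, and the function name is my own.

```python
def disable_asprintf(config_text):
    """Return config.h text with the HAVE_ASPRINTF define commented out."""
    out = []
    for line in config_text.splitlines(keepends=True):
        if line.strip() == "#define HAVE_ASPRINTF 1":
            # Wrap the define in a C comment instead of deleting it.
            line = "/* " + line.rstrip("\n") + " */\n"
        out.append(line)
    return "".join(out)
```

Read config.h, pass its contents through disable_asprintf(), and write the result back before running make.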

In Case You Have Doubts…

Tuesday, November 16th, 2010

I hesitate writing this entry, because no good can come of it. But I want you to grasp the new reality.

This is mail that just came across the OpenSolaris Security list:

On 11/03/10 20:07, Pete Chan wrote:
> Hello. Does any one know if Oracle has any plans in incorporating SSH
> HPN in the new release of Solaris?

Oracle's plans for features in future releases of Solaris are unlikely to be communicated here.

Please contact Oracle via your normal support/sales channels.

--
Darren J Moffat

WOW! Talk about slamming the door on the fingers. Not even the courtesy of a BS response like “Good idea” or “Unlikely due to its current state” or something…. just slams the door. I’m amazed and saddened.

Solaris 11 Express Arrives

Monday, November 15th, 2010

It’s here… Solaris 11 Express. Billed as snv_151a.

You can download it immediately. Be careful when you do so, the version most people will want is the LiveCD, which is under the “Other Downloads” section. If you download the “Text Only” version you will not get a desktop environment. Maybe a subtle hint about the return of Solaris as a server centric OS.

Let’s get the bad news out of the way… No source. If that’s a deal killer for you, go download OpenIndiana based on Illumos.

So, if you aren’t as principled and just want to get back to the latest and greatest from Sun/Oracle you can wipe out your workstation and have a go with the LiveCD. So long snv_134 (ie: the now dead last release of OpenSolaris).

The important news is really for enterprise shops that plan to continue with Solaris. If you’ve spent time with OpenSolaris then you already know most of what you need to. If you didn’t soak in OpenSolaris then you need to start getting up to speed now. Your primary concerns will be IPS, ZFS Boot Environments (no more UFS root), and the Automated Installer (AI). You’ll need all the time you can get to familiarize yourself with these big changes prior to Solaris 11’s real arrival.

Besides the “catchup” from snv_134, which is itself significant, notably in the networking space (VRRP, w00t!), snv_151 includes ZFS Encryption. ZFS Crypto has been talked about for years now, but it’s here at last. Darren Moffat shared the good news in his blog. Give it a read for all the crypto goodness.

So sink your teeth in and enjoy the sweet taste while it lasts. The gravy train is over and I don’t expect much more to write home about, some tweaks but nothing fantastical. Get ready for the new normal, and keep Illumos visible in the corner of your eye to see some real innovation start to unfold there.

Oracle Solaris Summit: Today

Tuesday, November 9th, 2010

LISA is happening this week in San Jose and today is the Solaris Summit. It’s a free event, so if you’re in Silicon Valley come by… if not, the rest of you can watch the live stream.

I wouldn’t expect much in the way of surprises at the event. If you’re reading this you’re already tapped into the OpenSolaris community and thus know what’s coming.

Silicon Valley OpenSolaris User Group Lives: Meeting Tonight!

Thursday, August 26th, 2010

Sorry for the late notice, but SVOSUG is meeting tonight. Several folks from the Joyent crew and I will be on hand.

6:45pm
274 Castro Street, Suite 204
Mountain View
(above Meyer Appliance & Kitchens; look for the OpenSolaris sign on the door)

Tonight’s guests will be Garrett D’Amore presenting Illumos and Anil Gulecha presenting Nexenta.

The discussion will in essence be about the rebirth of OpenSolaris in a post-Oracle era.

If you can’t attend in person, it will be webcast: http://www.ustream.tv/channel/svosug-feed2

Be there in person or attend the webcast, but don’t miss it!

A big thanks goes out to Alta Elstad for keeping the faith and keeping SVOSUG alive! Alta rules!

Illumos Shines New Light

Saturday, August 7th, 2010

As many of you have no doubt heard, this week Illumos opened its doors to the world.

What is Illumos? A change to put “OpenSolaris” back on track. When this sleigh ride started, “OpenSolaris” wasn’t a distribution, it was a community. It was users and developers and sysadmins gathering around a great operating system’s code, now free to learn from, contribute to, and innovate on. But that’s not how it really went down… is it?

Illumos isn’t a fork. There is no such desire. We’re simply moving the code out into the community, where it belongs, and leaving the corporate red tape behind. Garrett D’Amore, who has spearheaded this and will serve as our benevolent dictator for the time being, has already invited Oracle to participate. I really hope that they do. We have here now a way for the community to contribute better than ever before, and a way to cross-pollinate with the Oracle gate in an orderly way. By keeping them in sync we can share between the two as we wish.

What’s best of all is that while Garrett is a Nexenta employee, this is not a Nexenta owned project. Nexenta will use it, as will Joyent, as will Belenix, as will anyone else who desires.

While I wish we didn’t need to set up an external community repository, all other alternatives have been exhausted.

This isn’t really even the first time this has happened. At Genunix there were, for some time, SVN repositories maintained… they simply didn’t get much love. What’s different this time is that there are an increasing number of developers that depend on this codebase, and it cannot be at the continuous mercy of Oracle. We can find security in having our own community gate.

I personally applaud Garrett for his decisive leadership and Nexenta for allowing him to pursue this. The future is looking a lot brighter and I really hope that Oracle will join in and we can all work hard to innovate on this amazing platform, together.

OGB Threatens to Shoot Itself In The Head

Monday, July 12th, 2010

This morning, at the 8AM (Pacific) OpenSolaris Governing Board (OGB) meeting, the following was proposed and unanimously resolved:

The OGB is keen to promote the uptake and open development of OpenSolaris and to work on behalf of the community with Oracle, as such the OGB needs Oracle to appoint a liaison by August 16, 2010, who has the authority to talk about the future of OpenSolaris and its interaction with the OpenSolaris community otherwise the OGB will take action at the August 23 meeting to trigger the clause in the OGB charter that will return control of the community to Oracle.

That is to say, “start talking to us or we’ll just shoot ourselves in the head.”

I made my opinion very clear via the IRC back-channel during the call. At least my call for a liaison was added into the resolution, but I am extremely opposed to this cowardly act.

What exactly do we have to gain or Oracle to lose? All Oracle has to do is run out the clock, the entire OGB resigns, and then the one little bit of control the community has is gone. What motive, other than a benevolent act to garner press attention, does Oracle have to comply? We’ve just made their job easier.

I once advocated this kind of self-implosion tactic back in the Sun days. The reason was to re-organize the OpenSolaris leadership to be more engaged and industry focused. That was a good idea back in the days when I had faith that Sun would “do the right thing”. However, those times have passed. Oracle has made it clear that it either controls things or it doesn’t… there is no give and take. I don’t think we can demolish the structure and believe that Oracle will re-organize in such a way as to give the community more power. It was a long shot with Sun anyway.

Frankly, imho, this is just the OGB throwing its hands in the air. The body has been useless for a long time, but only because it has chosen to be. The majority of the OGB’s life has been wasted trying to restrict its own authority by endlessly debating and rewriting the constitution. It’s never led anything, and it isn’t now.

But the fact that it’s a wet rag doesn’t mean we should simply throw in the towel. A weak seat of power is better than no seat at the table.

So where do we go from here? Who knows. At this point the die is cast and the OGB is making its last stand. Maybe Oracle gets serious and does something, but I really doubt it. Not because they can’t, but because it’s not in their best interest. Why kill something intent on killing itself?

My only concern at this point is to not lose regular code updates and access to the bug database. Yes, the existing code is “out there”, but Oracle is still the biggest contributor, 99.999 to 1. Anyone can fork at any time right now, as is, so if you’re going to do that why would you risk cutting off the huge contributions continuously made by Oracle?

We’re in no worse a position right now than we were during the Sun days. They didn’t communicate, we had no visibility or impact on the OpenSolaris distribution, etc. Don’t fall for the lie that things are now “worse” than they were… they aren’t. It’s status quo. The difference is that the OGB is no longer composed of Sun insiders who can get a sense of control from hallway conversations; its members are now as blind and weak as those of us in the community always have been.

The request for a liaison is a good one… I support it. But damnit, put the gun down. We don’t need to act like irrational children having a tantrum. Ultimatums rarely work out the way you hope.

The bar is lower than the original resolution was, so we’ll hope for the best and see.

UPDATE: OGB Member Peter Tribble has written a blog entry about this action; it’s recommended reading. While I disagree with the action, Peter is a great guy whom I greatly respect.