Archive for the ‘SysAdmin’ Category

Duo Security: Two Factor Auth for the Masses

Thursday, June 9th, 2011

Smart Cards, OTP, hardware tokens like SecurID… 2 factor auth is an old standby and considered mandatory for any high security installation.  But let’s face facts, there are a myriad of problems involved.  SecurID is complex and expensive, and its credibility took a serious hit following the Lockheed break-in.  Smart Cards are really sweet, especially solutions from ActivIdentity, but again they’re expensive and they impose client hardware requirements, which can be a problem with many users.  OTP is nifty, but most of the solutions out there are ancient and may not work with the platform you’re using.  But… that is the price of security, right?  And what about all these new cloud deployments?  Traditional 2 factor solutions for your cloud?  Just shoot me.

Today I stumbled across Duo Security and was amazed.  It is an entirely modern 2 factor auth system that uses a SaaS model, open source client software and open APIs, integrates with just about anything, and uses the phone you already have in your pocket.

What’s amazing is that the guys at Duo have nailed the setup.  You go to Duo Security and sign up for an account; before you’ve even finished registering they’ve already verified your phone via an automated voice call.  You finish the easy wizard and within 2 minutes you’re looking at their dashboard with a free account that supports up to 10 users.  For a UNIX system you download and compile their software (packages are available for Linux distros), which includes a client program as well as a PAM module.  You add a new “Integration” (essentially an auth realm with its own API key), feed the keys into the client configuration (which is only 3 lines long, btw), and run the client, which gives you a URL to finish validating the host, and you’re done.  10-15 minutes after first hitting their website you are up and running with 2 factor security without a bit of pain.  It’s so simple it just makes me smile… and how often does anything security related do that?

Duo supplies special variations of the service that are just as easy for Juniper, Cisco and Sonicwall VPNs, as well as a Web API… but I’m not going to address those here.

Once your UNIX host is set up, you have some options on how to employ it.  You can use PAM, which will make all users dual auth via Duo, or you can use a nifty per-user SSH trick by adding command="/usr/local/sbin/login_duo" to the beginning of your public key line in the .ssh/authorized_keys file (which I didn’t even know was possible).  If you don’t have the ability to modify PAM, this SSH hack is a great solution.
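To make the per-user SSH trick concrete, here is a minimal sketch in Python of what the edit to authorized_keys amounts to.  The function name is my own invention, and it assumes each key line has no existing options in front of the key type (a line that already carries options would need them merged with commas instead).

```python
def force_duo_login(authorized_keys_text, login_duo="/usr/local/sbin/login_duo"):
    """Prefix every plain public key line with a forced command.

    Hypothetical helper for illustration only. Assumes key lines carry no
    other options; blank lines, comments, and lines that already start with
    command= are left untouched.
    """
    out = []
    for line in authorized_keys_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#") or stripped.startswith("command="):
            out.append(line)
        else:
            # sshd runs this command instead of the user's shell for this key
            out.append(f'command="{login_duo}" {stripped}')
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    sample = "ssh-rsa AAAAB3Nza benr@laptop\n"
    print(force_duo_login(sample), end="")
```

Running it twice over the same text is harmless, since lines that already begin with command= are skipped.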

But what’s really important is the experience of actually using it for auth.  Here is how it works for real using an SSH session.  When you log into your system, after it accepts your password or key as usual, it stops the auth process and asks how to contact you:

Ben-Rockwoods-MacBook-Pro:~ benr$ ssh cuddletech.com
Password:
Duo login for benr

 1. Duo Push to XXX-XXX-1100
 2. Phone call to XXX-XXX-1100
 3. SMS passcodes to XXX-XXX-1100
Passcode or option (1-3): 1

Pushed a login request to your phone...

At this point the SSH session is waiting on you.  Notice you have 3 choices: Duo Push (smartphone app), phone call or SMS.  Duo Push is a free app for Android and iPhone which can accept push notifications.  When you do your setup, part of the process will be installing this app if you wish, which only takes 2-3 minutes.  If you choose to use Duo Push, as I did, you’ll see a login request on your phone to approve or deny.

After accepting, your SSH session comes back to life:

Success. Logging you in...

Last login: Wed Jun  8 22:57:20 2011 from xxxxxx
                                __                       __
                       __      / /___  __  _____  ____  / /_
                    __/ /___  / / __ \/ / / / _ \/ __ \/ __/
                   /_  __/ /_/ / /_/ / /_/ /  __/ / / / /_
                    /_/  \____/\____/\__, /\___/_/ /_/\__/
                                    /____/
[cuddletech:~] benr$

It’s that easy!

Duo just got everything spot on: it’s easy, the documentation is clear and concise, it’s just beautiful.  The best part of it all is that it’s free for up to 10 users, which means that if you just have a single web server you wish to secure, you can!  Thanks to the SSH hack above you could even do it on a Shared Hosting account.  There is even a plugin for WordPress to use Duo for WP login.

To get started with it yourself, I recommend this post on the Duo blog: Announcing Duo’s two-factor authentication for Unix.  It walks you quickly through the whole process I described above.

In all fairness, I’ve only been using this for less than a day, so I’m sure there are kinks I’ll run into and things to be improved, but it truly is amazing that I’ve got what feels like a solid solution working so quickly.  Auditing and logging get a lot more interesting when you don’t have to second guess whether or not the user is in fact the user you think, and this product opens up a lot of new possibilities and fills a real gap in the world of cloud security.

NOTE FOR OPENSOLARIS/ILLUMOS PAM USERS:

After you download and unpack duo_unix-1.6.tar.gz, run “./configure --enable-pam”.  Before you run “make”, edit config.h and comment out the line “#define HAVE_ASPRINTF 1”.  After that PAM will compile fine.  If you don’t, you’ll get “pam_extra.h:10: error: syntax error before "va_list"”.  Also, make sure that you have an ‘sshd’ user for Duo to use.

Personal Must-Haves in the Data Center

Tuesday, May 3rd, 2011

When you go into the data center, either for rack’n’stack or maintenance, there are a couple of things that can make your life easier.  You want to go light, of course, but also have everything you need so that you’re not going to have to postpone work due to lack of gear.

Common must-haves include:

  • A very capable laptop.  This is your primary tool.  I prefer the 15″ MacBook Pro, but whatever you use you’ll want gigabit ethernet, wifi, serial capabilities, etc, etc.
  • An RS-232 serial to USB adapter.  I use a Keytronics adapter with my Mac.  For software I use the Keyspan Serial Assistant and ZTerm.
  • Several different serial cables and gender benders.
  • A good bag.  I prefer the Ogio Hip-Hop or Timbuk2 Commute 2.0 but many like backpacks or other types of messenger bags.

But those are boring essentials… here are my not-as-normal must-haves.

1. Leatherman Skeletool CX Multitool

The perfect evolution of the Leatherman. The CX is made from carbon fiber and is extremely light, but feels solid. I love the carabiner, which I clip to my front belt loops, which means it’s not in some sheath I’ll lose or in my pocket scratching my phone. Clipped in front I forget that it’s there, it’s that light. It also has a pocket clip if you prefer.

The blade is excellent and it has all the right features. The universal bit driver is great if you want to carry an optional sheath with a variety of bits, but on the tool there is a single storage slot, which means you always have 2 bits (dual sided) on the tool at all times.

The only downside to the tool is that it is so light and small that if you take it off your person you can easily forget it. I did this once and went nuts until I got another one a week later.

2. Contigo Autoseal 16oz Mug

Everyone knows that liquid is forbidden in the data center. Everyone also knows that it’s hard to enforce and rarely is. Nevertheless, no one wants to let you do it and no one wants to cause a problem. Furthermore, coffee in a Starbucks paper cup goes cold in 15-20 minutes thanks to the HVAC in the data center.

The Contigo mug is the best coffee mug I’ve ever seen. It feels good in your hands, is good to drink from, and fits easily into your bag. But most importantly, it is completely spill proof. It is the only coffee mug I’ve ever trusted so much that I would put it inside my bag. It keeps your coffee hot and won’t spill… what more can you want? If I’m warned by others in the data center I will take it and flip it in the air and catch it, just to show how solid it is.

$20 is a lot for a coffee mug, but it’s worth it. I have 6.

3. Contigo Autoseal 24oz Waterbottle

This is somewhat redundant, but when you’re in the DC for a long time, you need water. A refillable water container is best. The Contigo Autoseal is just as robust as the mug, but adds an excellent clip that allows you to attach it to your bag.

I should also note that the flow from both the mug and the water bottle is really good, unlike many others. You get a solid gulp, unlike drinking from a straw.

4. Apple iPhone 4

The iPhone is the ultimate tool. Take photos of servers or critical screens, take walk-around movies or record screens for later review, listen to music, make calls, send email, etc, etc, etc. Working in the DC is definitely much improved thanks to the iPhone 4.

5. IntelliScanner Pro 200 Barcode Scanner

For all the great things that the iPhone can do, its various bar-code reader applications are horrible. They certainly aren’t quick and they have trouble with small barcodes. The IntelliScanner is easy to use and fast. Your laptop will register it as a keyboard, so when you press the button to scan, the contents of the barcode are “typed in” wherever you like, which means you can use it with Excel just as easily as with my preferred auditing format, CSVs created in vi. ;)

Happy SysAdmins Day

Friday, July 30th, 2010

It’s that time of the year again. Happy SysAdmin Day everyone.

If today is dragging, might want to refresh your memory of the great OddTodd… always a pick-me-up.

FAST 2010 Proceedings Available

Saturday, March 6th, 2010

I’ve missed FAST 2010 yet again…. but, good news! The complete FAST 2010 Proceedings (PDF) are available for free. USENIX members can also view the presentation videos online.

Solaris Spit & Polish

Tuesday, February 10th, 2009

An interesting discussion has been taking place on the OpenSolaris SysAdmin Community list, and I sense it will lead us toward some important changes in Solaris. Essentially it all comes down to the lack of spit and polish. Something we have perhaps always ignored or downplayed now stands in stark contrast to truly easy to use yet sophisticated things such as ZFS or SMF.

The clearest examples are technologies that currently are essentially useless without custom scripting. Such examples include LDAP, Extended Accounting, and BSM Auditing.

LDAP is one that’s really concerned me. Almost any Solaris environment would benefit greatly from an LDAP/Kerberos implementation, for ease of management and increased security… but frankly, just dropping in a directory server and authenticating to it isn’t so straightforward. Populating and maintaining the DIT is complex, commonly requiring custom scripts and possibly a 3rd party LDAP browser. While the aging idsconfig script is supposed to jumpstart your experience, it’s not perfect and is tailored to Sun DSEE. In the community we commonly see people scratching their heads wondering if other directory servers, such as OpenLDAP, even work with Solaris and how to get started.

Microsoft hit a home run with ActiveDirectory, and it pains me in the same way that NetApp kicked Sun’s ass at building NFS servers. Sun is a systems company and the leading provider of directory/identity management products, but if you want to use them in conjunction with Solaris you’ve got a lot of custom work to do. As far as Kerberos, most of the use continues to be in academic environments, which means that the best means to secure NFS in a corporate environment just isn’t used.

Sun is very good at engineering the big things, but I’ve noticed that when it comes to connecting all the dots they tend to turn toward the path of acquisition. A need arises for a management app or something, they find a decent software company doing it, acquire them, and then slowly let the thing rot. I mean, how many people still use Sun Management Center or N1 Provisioning Server? (Or ever did for that matter.)

A lot of focus has gone into the GNU-ification of Solaris and improving the desktop experience with Indiana… I mean OpenSolaris… but at some point we’ve got to get back around to focusing on what Solaris does best, being the enterprise class server operating system we know and love.

This is especially important in the face of Cloud Computing. The cloud needs solid server operating systems, and Solaris leads the pack. If we’ve proved one thing with Solaris 10, it’s that making Solaris more like Linux doesn’t have nearly the impact we hoped it would, but making the complex very simple and straightforward (ZFS, DTrace, SMF, FMA, …) is dramatic.

Monitoring, Management, and Infrastructure are what we need. Easy, quick, and powerful. We have the technology underneath, we just need to bring it all together.

What say you?

Storage Trends from SNIA SDC

Wednesday, October 1st, 2008

The Storage Networking Industry Association’s (SNIA) Storage Developer Conference (SDC) is, as the fancy name suggests, not a place for storage hobbyists or the faint hearted. Attendees are leaders in our industry, highly informed and knowledgeable. If they are interested in it, we all will be soon. If you follow the storage press at all, the two big things on their minds won’t surprise you:

  1. De-duplication
  2. Solid State Disk (SSD)

From performance talks, to corruption analysis talks, to ZFS talks, to NFSv4 talks, every session either included a slide on both of these or was asked a question about them. Frankly, there were very few answers. One of the few came from Sun: the “hybrid storage architecture” for ZFS (for those in the know, this is L2ARC and ZIL offload, which are put on special SSDs). Most of the talks only noted “SSD will change everything… it’s too early to tell how.” Given that the concern of the show is largely primary storage, not secondary backup, de-dup constantly came up but rarely had a place.

If de-duplication is a new term for you, here’s the quick and dirty pitch. Imagine having to architect backups for 300 helpdesk PCs, all running a standardized Windows XP, an office stack, plus helpdesk support and naturally other user applications. Let’s say the average PC has 80GB of data on its local drive. So that’s 300 * 80GB (24TB) to back up, perhaps nightly. A nightmare. Historically, to reduce the backup load you’d either put user home directories on a centralized file server and back up only the file server, not the PCs, or you’d exclude paths such as C:\Windows (or whatever the hell they call it now). De-duplication typically uses hashing algorithms, either on the client or on the backup server, to avoid storing duplicate data blocks. So that means you only back up one copy of Windows XP, and then 299 references to it. If someone sends out a PDF of the company handbook that’s 5MB, and there are 300 local copies of it, that’s 1.5GB of the same file, but with de-duplication we store only a single 5MB file plus references to it.
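For the curious, the hashing idea can be sketched in a few lines of Python.  This is a toy of my own, not any vendor’s implementation: it assumes fixed-size 4KB blocks and SHA-256, keeps everything in memory, and the names dedup_store and restore are made up for the example.

```python
import hashlib


def dedup_store(files, block_size=4096):
    """Split each file into fixed-size blocks and keep one copy per unique block.

    Returns (store, recipes): store maps a block digest to the block's bytes,
    recipes maps a file name to the ordered list of digests needed to rebuild it.
    """
    store = {}
    recipes = {}
    for name, data in files.items():
        digests = []
        for offset in range(0, len(data), block_size):
            block = data[offset:offset + block_size]
            digest = hashlib.sha256(block).hexdigest()
            store.setdefault(digest, block)  # only the first copy is kept
            digests.append(digest)
        recipes[name] = digests
    return store, recipes


def restore(store, recipe):
    """Rebuild a file by following its list of block references."""
    return b"".join(store[digest] for digest in recipe)


if __name__ == "__main__":
    # 300 PCs each holding the same (tiny, fake) handbook
    handbook = b"".join(bytes([i]) * 4096 for i in range(5))
    files = {f"pc{n}/handbook.pdf": handbook for n in range(300)}
    store, recipes = dedup_store(files)
    logical = sum(len(data) for data in files.values())
    physical = sum(len(block) for block in store.values())
    print(f"logical {logical} bytes, stored {physical} bytes")
```

Real products differ in where the hashing happens (client or server) and how they chunk the data, but the bookkeeping of “one stored block, many references” is the same idea as the handbook example above.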

From the example you can see that customers backing up Oracle databases or customized purpose-built servers might not be in dire need of this technology (although they are interested too), but if you’re backing up server farms or desktop systems this is something you can’t wait another second to get your hands on; especially if you’re backing up to tape!

I should note, de-dup is becoming more than just a backup technology. Storage admins see applications for file servers and other applications. I’m certain that in 5 years de-duplication methodology will be used in ways I’d laugh at today.

As for SSD: it’s coming. I remember 10 years ago in a lab where we had a “Solid State Disk”, which in the pre-flash era meant a box with bank upon bank of RAM and a big battery. Today SSD is cheap and getting cheaper. But how will it be used?

Today we have the concept of “tiered storage”. This means different things depending on who you talk to. In some cases, such as Pillar Data, this is done by partitioning drive cylinders so that tier 1 data is on the outer (faster) tracks and tier 2, 3, 4 on the inner (slower) tracks. In other cases this means putting important fast access data on smaller 15K or 10K RPM FC or SAS disks as “tier 1”, and bulk data on larger “nearline” 7,200 RPM SATA disks. For customers using HSM (Hierarchical Storage Management) you can even automate the data migration back and forth across tiers, all the way out to tape, which was until recently cheaper per gig than disk.

So many storage administrators and architects seem to see SSD pushing into tier1 and pushing 15K spinning media down the stack. Instead of Fast, Slow, Tape, you get Super-Fast, Fast, Slow and potentially just dump tape.

I know I’m a zealot, but Sun really is leading the charge here. The Hybrid Storage Pool architecture is really brilliant because it views SSD not as faster disks, but rather as slow (relatively of course) non-volatile memory. Traditionally you have an in-memory filesystem cache (ZFS’s is called “ARC”); data flows through the cache and eventually is ejected to make room for fresher data, meaning that if you call that data again you go out to disk. ZFS’s L2ARC (Level 2 ARC) extends your in-memory disk cache using SSD, so if you go back for data you don’t have to go all the way out to disks. On busy file servers this is a massive win! A 64GB SSD is a really small disk, but as a secondary disk cache it’s massive! Plus, there is no management involved on the administrator’s part, no data policy or data classification to work out; the filesystem handles it for you.
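Here is a toy Python model of the second-level cache idea, just to show why it saves trips to disk.  Big caveat: this is a plain LRU demotion scheme I made up for illustration, not the actual ARC or L2ARC algorithm, and the class and parameter names are hypothetical.

```python
from collections import OrderedDict


class TwoTierCache:
    """Small RAM cache backed by a larger SSD cache, backed by slow disk.

    Simplified sketch: blocks evicted from the RAM tier are demoted to the
    SSD tier, so a later read can be served without going back to disk.
    """

    def __init__(self, ram_slots, ssd_slots, read_from_disk):
        self.ram = OrderedDict()
        self.ssd = OrderedDict()
        self.ram_slots = ram_slots
        self.ssd_slots = ssd_slots
        self.read_from_disk = read_from_disk
        self.disk_reads = 0

    def get(self, key):
        if key in self.ram:                 # fastest path: RAM hit
            self.ram.move_to_end(key)
            return self.ram[key]
        if key in self.ssd:                 # second chance: SSD hit
            value = self.ssd.pop(key)
        else:                               # miss: pay for a disk read
            self.disk_reads += 1
            value = self.read_from_disk(key)
        self._put_ram(key, value)
        return value

    def _put_ram(self, key, value):
        self.ram[key] = value
        if len(self.ram) > self.ram_slots:
            old_key, old_value = self.ram.popitem(last=False)
            self.ssd[old_key] = old_value   # demote instead of discarding
            if len(self.ssd) > self.ssd_slots:
                self.ssd.popitem(last=False)


if __name__ == "__main__":
    cache = TwoTierCache(ram_slots=2, ssd_slots=4, read_from_disk=lambda k: k * 10)
    for block in (1, 2, 3, 1):
        cache.get(block)
    print("disk reads:", cache.disk_reads)
```

With ssd_slots set to 0 the same access pattern costs one extra disk read, which is the whole point of the second tier.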

Sun’s other component of the ZFS Hybrid Storage Architecture is ZIL Offload. Most data access is asynchronous and can be nicely cached, with writes flushed to disk when it’s convenient. However, some applications such as databases or NFS do synchronous (O_DSYNC) IO; this flag requires that the filesystem immediately flush the data to stable storage. On a busy file server this is a performance killer. The ZFS ZIL (ZFS Intent Log) is where these synchronous writes go; by putting those writes on super-fast SSD you get several orders of magnitude performance improvement without relying on things like RAID Controller Write Back Caches.
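If you’ve never seen what a synchronous write looks like from the application side, here is a small Python sketch.  It assumes a Unix-like system (it falls back to O_SYNC where O_DSYNC isn’t defined), and the function name is my own; it says nothing about how ZFS handles the write internally.

```python
import os
import tempfile


def append_sync(path, payload):
    """Append payload to path, returning only once the data is on stable storage.

    Opening with O_DSYNC makes every write() block until the filesystem has
    committed the data, which is the kind of IO the intent log has to absorb.
    """
    sync_flag = getattr(os, "O_DSYNC", os.O_SYNC)  # Unix only
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND | sync_flag, 0o600)
    try:
        return os.write(fd, payload)
    finally:
        os.close(fd)


if __name__ == "__main__":
    log_path = os.path.join(tempfile.mkdtemp(), "intent.log")
    append_sync(log_path, b"commit 1\n")
    append_sync(log_path, b"commit 2\n")
    with open(log_path, "rb") as fh:
        print(fh.read())
```

Time a few thousand of these against the same loop without the flag and you’ll see why a busy database or NFS server cares where those writes land.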

Since we’re talking about SSD, let me point out that not all SSD’s are created the same. There are two main types of SSD on the market right now: MLC and SLC. Here’s the 60 second explanation:

  • Single-Level Cell (SLC): These flash devices have higher performance, more write/erase cycles and thus greater endurance, use less power, but cost much more. These are generally considered “Enterprise Grade SSD”.
  • Multi-Level Cell (MLC): In contrast to SLC, these devices have lower performance and less endurance, but offer much higher density and lower cost per bit. If you see a “cheap” $300 SSD at Fry’s or NewEgg it’s almost certainly MLC. These are generally considered “Consumer Grade SSD”.

If you see a Sun presentation on Hybrid Storage, you’ll see them refer to these as “Read Biased” (MLC, slower but higher capacity) and “Write Biased” (SLC, faster but less capacity). By using the appropriate technology in the appropriate role they significantly reduce cost for an SSD deployment. If you look at everyone else out there just viewing SSD as “fast disk”, the decision between SLC and MLC is really just a matter of cost; if you can afford SLC great, if not MLC, or perhaps even sub-tiering SLC to MLC SSD.

So that’s de-dup and SSD. If you haven’t heard of these, you will. Familiarize yourself with the basics now and you’ll be better prepared for the future.

On a closing note, I talked to several people about SMART data. I’m shocked by how many people tell me to ignore SMART data as untrustworthy and unreliable. I was hoping someone at the show would disagree… I was disappointed. Most other experts agree, vendors don’t trust SMART data and in some cases outright “fudge” the data or at the least disregard conclusions based on the data. One person remarked that most drives sent to Seagate due to a SMART suggested failure are simply scrubbed, cleared, and re-shipped. So, the belief that SMART data is something to be seriously monitored by admins continues. If you have it, nifty, but if not, oh well. As for me… I love telemetry, so SMART still has a warm spot in my heart, wrinkles and all.

UPDATE: Just hours before I wrote this, Mr. Harris of StorageMojo wrote about NetApp’s efforts to bring de-dup to primary storage.

SA Pro Episode 0: Education and Qualifications

Friday, August 29th, 2008

The very first episode of SA Pro is here!

In the podcast we’ll use one of two formats, classic 1-on-1 interview style and a round-table discussion format. This episode is the latter.

Together with Joe Moore of Siemens and Mark Imbriaco of 37signals we discuss the following questions:

  1. What is the mark of a good SA?
  2. What are the essential qualifications?
  3. Do formal education and/or certs matter?

What’s really new and unique is that Joe, Mark, and I don’t know each other. They both responded to a request for participants on the OpenSolaris SA’s list and matched the qualifications I was aiming for; that’s the extent of it. This is interesting because even though the three of us are in very different circumstances, have different histories, and are geographically separated, we’re not very dissimilar. It amazes me how much unity there is among a group with so few governing institutions.

The podcast is 1hr 6 mins and definitely worth a listen. Feedback is appreciated, but this was the first one, so be kind. (Yes I know my audio was too low.)

A huge thank you goes out to Joe and Mark for participating!

You & Your Hard Drive in the 21st Century

Tuesday, April 29th, 2008

If 10 years ago someone said “One day your wife will carry an extra hard drive in her purse”, I’d have rolled my eyes. On a recent trip to pick up a hard drive (to replace the piece of crap that died in my MacBook Pro; so far every Apple laptop we’ve owned has had an OEM drive die) I saw, to my amazement, this:

CaseLogic, the folks that made those CD cases we all used to have in our cars, is now making neoprene sleeves for 2.5″ hard drive enclosures. This is telling to me… CaseLogic decided that there was enough of a market to start peddling these. This says something about modern storage, says something about the expected reliability and mobility of spinning storage, and says something about the capacity of the ever more affordable flash storage in USB keys and such. And, the strange thing is, I just had to buy one.

But wait, there’s more! The wall of 3.5″ enclosures had been pushed aside by a giant selection of 2.5″ enclosures, most of them powered by the USB line alone, no need for an external DC plug. And in the corner of the rack was this interesting toy:

This is a Thermaltake BlacX HDD Docking Station. It accommodates 2.5″ and 3.5″ SATA drives… like a damned Nintendo cartridge! And, the really funny thing is you’ll find yourself blowing dust off the SATA paddle before inserting… oh the memories.

Most geeks, like myself, probably have a growing stack of SATA drives that aren’t terribly old but have fallen by the wayside as storage capacities have skyrocketed and prices plummeted in the last 3 years. Sure, there are lots of snazzy USB/Firewire/eSATA enclosures out there, but generally the drives aren’t worth it… but no longer is this a problem! Your old hard drives are now very easy to use removable media for all your backup or temporary storage needs, no adapters or sleds required, just dust one off and slide it into the dock.

These two things, combined with the fact that your grandma’s new Dell is probably going to have a 1TB drive (something that didn’t seem possible in a 3.5″ form factor just a couple of years ago), some hope that areal density will give 2.5″ drives capacities well beyond 300GB in the future, and the coming wave of SSD solutions… storage looks to be at the peak of a wave that’s going to crash out a lot of interesting things in the next couple of years.

Of course, what concerns me is that while bus speeds increase and capacities grow, throughput in real world situations is still low. 30MB/s is still considered pretty good in real-world usage because those poor little heads can only move so fast. Tiered storage combined with RAID is interesting considering the increases in areal density because the outer cylinders contain so much data, but with COW filesystems such as ZFS growing in popularity, the data is increasingly spread around the platters if left unchecked, which leads to slower transfer rates outside of the benchmarks. Bigger buffers can help, but in random workloads prefetch doesn’t help as the drive doesn’t know what sector to prefetch.

It wasn’t long ago that I was begging a storage vendor to keep sending me 72GB drives because the rebuild times for a failed 167GB drive scared me. Gigabit speed networks increase the utilization of storage over the network, but again, those drive heads can only move so fast. I’m really interested to see what comes in the next couple years to try and bring the random throughput of drives in line with their capacities. Will SSD be the solution or can spinning media vendors pull a rabbit out of their hats? Unless they do, my hunch is that in 10 years enterprise systems will be shipping with SAS SSD drives and relegating spinning media to secondary storage.
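To put rough numbers on the rebuild worry, here is a back-of-the-envelope calculation in Python.  It assumes decimal units (1GB = 1000MB) and that a rebuild is limited purely by the 30MB/s real-world figure mentioned above, which is optimistic for a busy array; the function name is just for the example.

```python
def rebuild_hours(capacity_gb, throughput_mb_s):
    """Best-case hours to rewrite a whole drive at a sustained throughput."""
    seconds = capacity_gb * 1000 / throughput_mb_s
    return seconds / 3600


if __name__ == "__main__":
    for capacity_gb in (72, 167, 1000):
        hours = rebuild_hours(capacity_gb, 30)
        print(f"{capacity_gb:>5} GB at 30 MB/s: {hours:.1f} hours")
```

Capacity grows, the 30MB/s doesn’t, and the window in which a second failure can hurt you grows right along with the drive.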

Any way you look at it, some kool stuff is coming; storage geeks stay vigilant!

Up.Time Software: The Ultimate Monitoring Solution

Saturday, January 26th, 2008

A couple things here and there have kept me from continuing my series of posts regarding systems management solutions. One of the monitoring solutions I’ve planned to write about is Up.Time. While I haven’t had the time to write it, I was thrilled to check my favorite site, SunHelp.org, and see that Super Admin Bill Bradford wrote an excellent review himself: Software Review: up.time 4 Enterprise Monitoring.

In my professional opinion, Up.Time is the best, most comprehensive, and most polished out of the box solution available at any price. Yes, it’s proprietary closed source commercial software… but, whether you’re using Zenoss, HP OpenView, Hyperic, or any other solution out there, you’re only going to get a small subset of monitoring capability without spending some time extending it yourself or digging around for modules written by someone else. Most, such as NetNMS or Zenoss, are limited by the OIDs exposed by SNMP and then extended by creating custom scripts that SSH into boxes every n seconds. Others such as Zabbix and Hyperic provide a client side agent that gathers up fairly generic information such as disk usage, CPU and memory usage, and maybe an odd and end on top. But Up.Time gathers a massive range of metrics, stores them all, and provides useful graphing and reporting capabilities, including report automation, to make it all very useful.

I’ve solved more than a few problems because of the realization that the historical data I needed to analyze a problem was already right under my nose because Up.Time had been gathering it and I didn’t even realize it. A great example is IO response time! I spent quite a bit of time ripping apart iostat.c to learn how to extend Zabbix, Hyperic, or other solutions to record asvc_t… then I realized that all that data was already being gathered with Up.Time right out of the box. Not only does it gather single return metrics, it also stores useful multi-string return data such as the top CPU consuming processes during a given period. Just knowing that the CPU was saturated on Monday of last week isn’t enough! What was actually using that CPU? Up.Time can tell you, no modification required.

With all my searching to date, there is only one “install and forget” solution on the market, and that’s Up.Time. If you want to solve your monitoring problems with money it’s hands down the solution you need to use. I’m not saying it’s perfect, there are a couple things here and there I’d like to change, but I’m hard pressed to find anything as powerful as it.

Read Bill Bradford’s excellent review for a better look at Up.Time.

The Joy of IPMI

Thursday, December 27th, 2007

Chances are you’ve heard of the Intelligent Platform Management Interface, or IPMI. And chances are very good that you view it as little more than a way to remotely reboot servers. But IPMI is oh so much more than that… wonders await you, should you just take the time to explore a little. So let’s start with the basics and work outwards.

Almost any modern server is going to have a Baseboard Management Controller, or BMC for short, on the mainboard. On whitebox motherboards such as Tyan or ASUS this is normally an add-on option, but any purpose built server from Sun, HP, IBM, SuperMicro, etc, is going to have one on the board out of the box. The BMC acts as a hub for all the various sensor data on the board(s). In years past you might have heard of I2C and SMBus buses and sensors accessible via “lm-sensors”; the BMC is the hub for these various sensor buses. The BMC, therefore, has access to all the various sensors on a given system and is rich with useful data. The most common way to retrieve that data is via IPMI.

IPMI can be accessed in several ways; these methods are referred to as “channels”, as in communications channels. The two most common are via the LAN (“lan”) or, if your OS has a BMC driver, via the local device (commonly “/dev/bmc”). Please note that in many places the acronyms MC and BMC are used interchangeably. Now, to exploit those channels the most common method is to use the Open Source “ipmitool”. This tool is included with most OS’s (including Solaris, in /usr/sfw/bin) or can be downloaded from the IPMItool SourceForge page. Other projects exist, including OpenIPMI and GNU’s FreeIPMI. All these implementations offer a rich API for writing custom applications and CLI tools for interaction. As I said, IPMItool is by far the most common, so I’ll discuss it here.

First to clear up a common misconception… let’s take a Dell PowerEdge server. Many are convinced that you need a Dell Remote Access Card (or Controller depending on who you ask), better known as a DRAC, in order to use IPMI. You do not! Service Processors (SP), such as Sun’s ILOM and ELOM (on X86, we’re ignoring SPARC here) or Dell’s DRAC, are not BMCs; rather they are mini-computers on a card, typically running Linux, powered by “trickle” or “standby” power such that they are running even if the mainboard is not running. These cards simply act as a conduit to access the BMC and other functions of the system. The web interfaces on the SPs, for instance, commonly are just passing IPMI commands back to the BMC. Thus, if you click “Power On” in the SP web interface you’re really just sending an IPMI “power on” command to the BMC. The point is, you do not need an SP to use IPMI with a system! The caveat is, depending on the architecture of the system, you may require an SP to talk to the BMC if the system is not running. For instance, on a Dell PowerEdge you can talk IPMI to the BMC without a DRAC by “Sharing” the first gigabit port, meaning that you really don’t need a DRAC at all unless you want the ability to, for instance, get SNMP data, which is really just an SNMP agent on the DRAC pulling data from the BMC, returning it as OIDs and branding it as “Dell OpenManage”. To keep going with the Dell example, if you SSH onto a DRAC and use the “connect com2” command to do serial redirection, you’re actually doing a local IPMI Serial-over-LAN session, you’re just doing it inside the chassis.

Okay, so, IPMI is everywhere. So what can we do with it? Like I said, most people are familiar with this:

$ ipmitool power status
Chassis Power is on

$ ipmitool chassis status
System Power         : on
Power Overload       : false
Power Interlock      : inactive
Main Power Fault     : false
Power Control Fault  : false
Power Restore Policy : always-off
Last Power Event     : command
Chassis Intrusion    : inactive
Front-Panel Lockout  : inactive
Drive Fault          : false
Cooling/Fan Fault    : false
Front Panel Control  : none

$ ipmitool chassis power cycle
...

In the above examples I’m using the local “bmc” communications channel. The command “power” is actually a shortcut for “chassis power”, so “ipmitool power cycle” and “ipmitool chassis power cycle” are the same thing. When “-I (channel)” is not specified, the local “bmc” channel is used. Here is a LAN example. (The examples above were from a Sun X4150; those below are from a Dell 2950):

$ ipmitool -I lanplus -H 10.0.50.60 -U root -f /ipmi.pass power status
Chassis Power is on

$ ipmitool -I lanplus -H 10.0.50.60 -U root -f /ipmi.pass chassis status
System Power         : on
Power Overload       : false
Power Interlock      : inactive
Main Power Fault     : false
Power Control Fault  : false
Power Restore Policy : always-off
Last Power Event     :
Chassis Intrusion    : inactive
Front-Panel Lockout  : inactive
Drive Fault          : false
Cooling/Fan Fault    : false
Sleep Button Disable : not allowed
Diag Button Disable  : allowed
Reset Button Disable : not allowed
Power Button Disable : allowed
Sleep Button Disabled: false
Diag Button Disabled : true
Reset Button Disabled: false
Power Button Disabled: true

The syntax above is fairly straightforward. I’m using the “lanplus” channel (the “lan” channel is for IPMI 1.5 commands, whereas “lanplus” is for IPMI 2.0 RMCP+), -H specifies the IP address of the IPMI interface, and -U is the IPMI user (typically “root”). Recent releases of ipmitool let you use “-f /file” in place of the -P “password” option; the file contains the password in plaintext, which ensures that the IPMI password isn’t viewable via a process listing such as “ps -ef” or the SNMP process tables. The default password on Dell PowerEdge servers is “calvin”, on Sun Fire servers it’s “changeme”; in both cases the user is “root”.
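
Typing those options on every call gets old fast. As a rough sketch, the remote options can be wrapped in a small shell function, with a second helper that creates the password file readable by its owner only. The function names and the IPMITOOL, IPMI_USER and IPMI_PASSFILE variables are my own inventions for illustration and are not part of ipmitool; only the -I, -H, -U and -f options come from the examples above.

```shell
#!/bin/sh
# Hypothetical helpers; names and variables are illustrative, not part of ipmitool.

# Read a password from stdin and store it in a file only the owner can read.
# Usage: write_ipmi_passfile /path/to/file < source-of-password
write_ipmi_passfile() {
    passfile=$1
    # Accept a final line even if it lacks a trailing newline.
    IFS= read -r password || [ -n "$password" ] || return 1
    (
        umask 077                      # new files are created with mode 0600
        printf '%s\n' "$password" > "$passfile"
    ) || return 1
    chmod 600 "$passfile"              # tighten it if the file already existed
}

# Run an ipmitool command against a remote BMC over the lanplus interface.
# Usage: ipmi_remote HOST COMMAND [ARGS...]
ipmi_remote() {
    host=$1
    shift
    "${IPMITOOL:-ipmitool}" -I lanplus -H "$host" \
        -U "${IPMI_USER:-root}" -f "${IPMI_PASSFILE:-/ipmi.pass}" "$@"
}
```

With these loaded, “ipmi_remote 10.0.50.60 chassis status” should be equivalent to the long form shown above. Reading the password from stdin rather than taking it as an argument keeps it out of your shell history.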

So… what else is there to see? There are two really interesting things to look at…

The first is the Sensor Data Record (SDR) repository. Here you will find thresholds and values for all the available sensors. Here is an example on a Sun X4100 M2:

$ ipmitool sdr elist
sys.id           | 00h | ok  | 23.0 | State Asserted
sys.intsw        | 01h | ok  | 23.0 |
sys.psfail       | 02h | ok  | 23.0 | Predictive Failure Deasserted
sys.tempfail     | 03h | ok  | 23.0 | Predictive Failure Deasserted
sys.fanfail      | 04h | ok  | 23.0 | Predictive Failure Deasserted
mb.t_amb         | 05h | ok  |  7.0 | 34 degrees C
mb.v_bat         | 06h | ok  |  7.0 | 2.88 Volts
mb.v_+3v3stby    | 07h | ok  |  7.0 | 3.18 Volts
mb.v_+3v3        | 08h | ok  |  7.0 | 3.34 Volts
mb.v_+5v         | 09h | ok  |  7.0 | 5.02 Volts
mb.v_+12v        | 0Ah | ok  |  7.0 | 12.10 Volts
mb.v_-12v        | 0Bh | ok  |  7.0 | -12.35 Volts
mb.v_+2v5core    | 0Ch | ok  |  7.0 | 2.54 Volts
mb.v_+1v5core    | 0Dh | ok  |  7.0 | 1.53 Volts
mb.v_+1v2core    | 0Eh | ok  |  7.0 | 1.22 Volts
fp.t_amb         | 14h | ok  | 12.0 | 24 degrees C
pdb.t_amb        | 1Bh | ok  | 19.0 | 23 degrees C
io.t_amb         | 22h | ok  | 15.0 | 22 degrees C
bp.power         | 0Fh | ok  | 13.1 | State Deasserted
bp.locate        | 10h | ok  | 13.2 | State Deasserted
bp.locate.btn    | 11h | ok  | 13.2 | State Deasserted
bp.alert         | 12h | ok  | 13.3 | State Deasserted
fp.prsnt         | 13h | ok  | 12.0 | Device Present
fp.usbfail       | 15h | ok  | 12.0 | Predictive Failure Deasserted
fp.power         | 16h | ok  | 12.1 | State Asserted
fp.locate        | 17h | ok  | 12.2 | State Deasserted
fp.locate.btn    | 18h | ok  | 12.2 | State Deasserted
fp.alert         | 19h | ok  | 12.3 | State Deasserted
fp.ledbd.prsnt   | 1Ah | ok  | 12.0 | Device Present
ps0.prsnt        | 1Ch | ok  | 10.0 | Device Present
ps0.vinok        | 1Eh | ok  | 10.0 | State Asserted
ps0.pwrok        | 1Dh | ok  | 10.0 | State Asserted
ps1.prsnt        | 1Fh | ok  | 10.1 | Device Absent
ps1.vinok        | 21h | ns  | 10.1 | Disabled
ps1.pwrok        | 20h | ns  | 10.1 | Disabled
io.id0.prsnt     | 23h | ok  | 15.0 | Device Present
io.id1.prsnt     | 24h | ok  | 15.0 | Device Absent
io.hdd0.fail     | 25h | ok  |  4.0 | Predictive Failure Deasserted
io.hdd1.fail     | 26h | ok  |  4.1 | Predictive Failure Deasserted
io.hdd2.fail     | 27h | ok  |  4.2 | Predictive Failure Deasserted
io.hdd3.fail     | 28h | ok  |  4.3 | Predictive Failure Deasserted
p0.t_core        | 29h | ok  |  3.0 | 24 degrees C
p0.v_vdd         | 2Ah | ok  |  3.0 | 1.38 Volts
p0.v_vddio       | 2Bh | ok  |  3.0 | 1.85 Volts
p0.v_vtt         | 2Ch | ok  |  3.0 | 0.91 Volts
p0.fail          | 2Dh | ok  |  3.0 | Predictive Failure Deasserted
p0.d0.fail       | 2Eh | ok  | 32.0 | Predictive Failure Deasserted
p0.d1.fail       | 2Fh | ok  | 32.1 | Predictive Failure Deasserted
p0.d2.fail       | 30h | ok  | 32.2 | Predictive Failure Deasserted
p0.d3.fail       | 31h | ok  | 32.3 | Predictive Failure Deasserted
p1.t_core        | 32h | ok  |  3.1 | 21 degrees C
p1.v_vdd         | 33h | ok  |  3.1 | 1.38 Volts
p1.v_vddio       | 34h | ok  |  3.1 | 1.85 Volts
p1.v_vtt         | 35h | ok  |  3.1 | 0.91 Volts
p1.fail          | 36h | ok  |  3.1 | Predictive Failure Deasserted
p1.d0.fail       | 37h | ok  | 32.4 | Predictive Failure Deasserted
p1.d1.fail       | 38h | ok  | 32.5 | Predictive Failure Deasserted
p1.d2.fail       | 39h | ok  | 32.6 | Predictive Failure Deasserted
p1.d3.fail       | 3Ah | ok  | 32.7 | Predictive Failure Deasserted
ft0.fm0.fail     | 3Bh | ok  | 29.0 | Predictive Failure Deasserted
ft0.fm1.fail     | 3Ch | ok  | 29.1 | Predictive Failure Deasserted
ft0.fm2.fail     | 3Dh | ok  | 29.2 | Predictive Failure Deasserted
ft1.fm0.fail     | 3Eh | ok  | 29.3 | Predictive Failure Deasserted
ft1.fm1.fail     | 3Fh | ok  | 29.4 | Predictive Failure Deasserted
ft1.fm2.fail     | 40h | ok  | 29.5 | Predictive Failure Deasserted
ft0.fm0.f0.speed | 41h | ok  | 29.0 | 7900 RPM
ft0.fm2.f0.speed | 43h | ok  | 29.2 | 7200 RPM
ft0.fm1.f0.speed | 42h | ok  | 29.1 | 7400 RPM
ft1.fm0.f0.speed | 44h | ok  | 29.3 | 9200 RPM
ft1.fm1.f0.speed | 45h | ok  | 29.4 | 9100 RPM
ft1.fm2.f0.speed | 46h | ok  | 29.5 | 8400 RPM
ft0.fm0.f1.speed | 47h | ok  | 29.0 | 7800 RPM
ft0.fm1.f1.speed | 48h | ok  | 29.1 | 7400 RPM
ft0.fm2.f1.speed | 49h | ok  | 29.2 | 7100 RPM
ft1.fm0.f1.speed | 4Ah | ok  | 29.3 | 9100 RPM
ft1.fm1.f1.speed | 4Bh | ok  | 29.4 | 9000 RPM
ft1.fm2.f1.speed | 4Ch | ok  | 29.5 | 8400 RPM

You can see that some of these are boolean failure warnings, such as “io.hdd0.fail”. By using the “elist” option the status is de-referenced, so we can see that it’s set as “Predictive Failure Deasserted” (without “elist” this reports as 0x01). The fans, however, output the speed, and the temp sensors output the current reading.
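
With that many sensors, the lines you care about are usually the ones whose status column is something other than “ok”, like the “ns” entries for the absent second power supply above. Here is a minimal sketch of a filter for that, assuming the five-column pipe-separated layout shown in the listing; the function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print "name: status" for every sensor whose status
# column (the third pipe-separated field) is not "ok".
# Reads "ipmitool sdr elist" output on stdin.
sdr_not_ok() {
    awk -F'|' '
        NF >= 5 {
            name = $1
            status = $3
            gsub(/^[ \t]+|[ \t]+$/, "", name)     # strip column padding
            gsub(/^[ \t]+|[ \t]+$/, "", status)
            if (status != "ok") print name ": " status
        }'
}

# Example: ipmitool sdr elist | sdr_not_ok
```

An empty result means every sensor reported “ok”, which makes this easy to drop into a cron job that only mails you when there is output.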

While a full dump of the sensor repository is neat to look at, you’ll want to cherry-pick values for practical purposes such as monitoring. For instance, let’s get just the motherboard ambient temperature reading using the sister command of “sdr”, “sensor”:

$ ipmitool sensor reading "mb.t_amb"
mb.t_amb         | 34

If we want to feed this value to our monitoring application, such as Zabbix, Nagios, Cacti, and friends, we just parse that to display only the value, and we’re good to go:

$ ipmitool sensor reading "mb.t_amb" | awk '{print $3}'
34
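
One caution: splitting on whitespace only works because “mb.t_amb” contains no spaces. On platforms whose sensor names do contain spaces, the third whitespace-separated field is no longer the value. A slightly more defensive sketch splits on the pipe character instead; the function name and the “Ambient Temp” sensor name used in the comment are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print only the value column of
# "ipmitool sensor reading" output, splitting on "|" so that sensor
# names containing spaces (e.g. a made-up "Ambient Temp") still work.
reading_value() {
    awk -F'|' '
        NF >= 2 {
            value = $2
            gsub(/^[ \t]+|[ \t]+$/, "", value)   # strip column padding
            print value
        }'
}

# Example: ipmitool sensor reading "mb.t_amb" | reading_value
```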

We can apply the same method to anything else in the SDR, allowing us to create pretty graphs and useful alerts based on voltages, fan speed, temperatures, or failure warnings. If you want more detail about a given sensor item, use “sensor get”. For example:

$ ipmitool sensor get 'mb.t_amb'
Locating sensor record...
--
BMC req.fn         : 0x4
BMC req.lun        : 0x0
BMC req.cmd        : 0x2d
BMC req.datalength : 0x1
BMC req.data       : 0x5
--
--
BMC req.fn         : 0x4
BMC req.lun        : 0x0
BMC req.cmd        : 0x27
BMC req.datalength : 0x1
BMC req.data       : 0x5
--
Sensor ID              : mb.t_amb (0x5)
 Entity ID             : 7.0
 Sensor Type (Analog)  : Temperature
 Sensor Reading        : 34 (+/- 0) degrees C
 Status                : ok
 Lower Non-Recoverable : na
 Lower Critical        : na
 Lower Non-Critical    : na
 Upper Non-Critical    : 70.000
 Upper Critical        : 75.000
 Upper Non-Recoverable : 80.000
--
BMC req.fn         : 0x4
BMC req.lun        : 0x0
BMC req.cmd        : 0x2b
BMC req.datalength : 0x1
BMC req.data       : 0x5
--
--
BMC req.fn         : 0x4
BMC req.lun        : 0x0
BMC req.cmd        : 0x29
BMC req.datalength : 0x1
BMC req.data       : 0x5
--
 Assertions Enabled    : ucr+ unr+
 Deassertions Enabled  : ucr+ unr+

This output spells out the various thresholds more explicitly, and this information is also useful to your monitoring or reporting solution. Spend some time on your platform playing with the “sdr” and “sensor” commands; hours of fun.
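
Since the BMC already knows its own thresholds, a monitoring check doesn’t have to hard-code them. As a sketch only, the function below reads “sensor get” output and compares the reading against the upper thresholds the BMC reports. The function name is hypothetical, the exit codes follow the usual Nagios plugin convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN), and it assumes the field labels look like those in the output above; lower thresholds are ignored to keep it short.

```shell
#!/bin/sh
# Hypothetical helper: reads "ipmitool sensor get" output on stdin, prints a
# one-line summary and returns 0 (OK), 1 (WARNING), 2 (CRITICAL) or
# 3 (UNKNOWN). Only the upper thresholds are checked.
check_sensor_upper() {
    awk -F':' '
        function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
        /Sensor Reading/     { split(trim($2), parts, " "); reading = parts[1]; have = 1 }
        /Upper Non-Critical/ { unc = trim($2) }
        /Upper Critical/     { ucr = trim($2) }
        END {
            # No reading line, or a non-numeric reading such as "na".
            if (!have || reading !~ /^-?[0-9]/) {
                print "UNKNOWN: no numeric reading found"
                exit 3
            }
            if (ucr != "" && ucr != "na" && reading + 0 >= ucr + 0) {
                print "CRITICAL: " reading
                exit 2
            }
            if (unc != "" && unc != "na" && reading + 0 >= unc + 0) {
                print "WARNING: " reading
                exit 1
            }
            print "OK: " reading
        }'
}

# Example: ipmitool sensor get 'mb.t_amb' | check_sensor_upper
```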

The second important feature is the System Event Log (SEL). It is exactly what you think it is:

$ ipmitool sel elist
 100 | 08/21/2007 | 13:25:45 | Voltage mb.v_+1v2core | Lower Non-critical going low  | Reading 0 < Threshold 1 Volts
 200 | 08/21/2007 | 13:25:45 | Voltage p0.v_vdd | Lower Non-critical going low  | Reading 0 < Threshold 1.00 Volts
 300 | 08/21/2007 | 13:25:46 | Power Supply ps0.pwrok | State Asserted
 400 | 08/21/2007 | 13:25:46 | Processor p0.fail | Predictive Failure Asserted
 500 | 08/21/2007 | 13:25:48 | Power Supply ps1.pwrok | State Asserted
 600 | 08/21/2007 | 13:25:50 | Voltage mb.v_+1v2core | Lower Non-critical going high | Reading 1.22 > Threshold 1 Volts
 700 | 08/21/2007 | 13:25:50 | Voltage p0.v_vdd | Lower Non-critical going high | Reading 1.38 > Threshold 1.00 Volts
 800 | 08/21/2007 | 13:25:52 | System Firmware Progress | Motherboard initialization | Asserted
 900 | 08/21/2007 | 13:25:52 | System Firmware Progress | Video initialization | Asserted
 a00 | 08/21/2007 | 13:25:58 | System Firmware Progress | USB resource configuration | Asserted
 b00 | 08/21/2007 | 13:26:09 | System Firmware Progress | Option ROM initialization | Asserted
 c00 | 08/21/2007 | 13:26:53 | System Firmware Progress | User-initiated system setup | Asserted
 d00 | 08/21/2007 | 13:27:11 | System Firmware Progress | Motherboard initialization | Asserted
 e00 | 08/21/2007 | 13:27:11 | System Firmware Progress | Video initialization | Asserted
 f00 | 08/21/2007 | 13:27:17 | System Firmware Progress | USB resource configuration | Asserted
1000 | 08/21/2007 | 13:27:25 | Power Supply ps0.pwrok | State Deasserted
1100 | 08/21/2007 | 13:27:27 | Power Supply ps1.pwrok | State Deasserted
1200 | Pre-Init Time-stamp   | Power Supply ps1.vinok | State Asserted
1300 | Pre-Init Time-stamp   | Entity Presence ps1.prsnt | Device Present
1400 | Pre-Init Time-stamp   | Power Supply ps0.pwrok | State Deasserted
1500 | Pre-Init Time-stamp   | Power Supply ps0.vinok | State Deasserted
1600 | Pre-Init Time-stamp   | Physical Security sys.intsw | General Chassis intrusion | Asserted
1700 | Pre-Init Time-stamp   | Entity Presence ps0.prsnt | Device Present
1800 | Pre-Init Time-stamp   | Power Supply ps1.pwrok | State Asserted
1900 | 11/14/2007 | 21:34:32 | System Firmware Progress | Motherboard initialization | Asserted
1a00 | 11/14/2007 | 21:34:32 | System Firmware Progress | Video initialization | Asserted
1b00 | 11/14/2007 | 21:34:38 | System Firmware Progress | USB resource configuration | Asserted
1c00 | 11/14/2007 | 21:35:09 | System Firmware Progress | Option ROM initialization | Asserted
1d00 | 11/14/2007 | 21:35:48 | System Firmware Progress | Motherboard initialization | Asserted
1e00 | 11/14/2007 | 21:35:48 | System Firmware Progress | Video initialization | Asserted
1f00 | 11/14/2007 | 21:35:54 | System Firmware Progress | USB resource configuration | Asserted
2000 | 11/14/2007 | 21:36:25 | System Firmware Progress | Option ROM initialization | Asserted
2100 | 11/14/2007 | 21:44:26 | System Firmware Progress | Motherboard initialization | Asserted
2200 | 11/14/2007 | 21:44:26 | System Firmware Progress | Video initialization | Asserted
2300 | 11/14/2007 | 21:44:32 | System Firmware Progress | USB resource configuration | Asserted
2400 | 11/14/2007 | 21:45:03 | System Firmware Progress | Option ROM initialization | Asserted
2500 | 11/14/2007 | 21:45:50 | System Firmware Progress | System boot initiated | Asserted
2600 | 11/14/2007 | 21:59:17 | Power Supply ps1.pwrok | State Deasserted
2700 | 11/14/2007 | 21:59:22 | Power Supply ps1.pwrok | State Asserted
2800 | 11/14/2007 | 21:59:35 | System Firmware Progress | Motherboard initialization | Asserted
2900 | 11/14/2007 | 21:59:35 | System Firmware Progress | Video initialization | Asserted
2a00 | 11/14/2007 | 21:59:41 | System Firmware Progress | USB resource configuration | Asserted
2b00 | 11/14/2007 | 22:00:12 | System Firmware Progress | Option ROM initialization | Asserted
2c00 | 11/14/2007 | 22:00:56 | System Firmware Progress | System boot initiated | Asserted
2d00 | 11/14/2007 | 22:14:01 | Power Supply ps1.pwrok | State Deasserted
2e00 | 11/14/2007 | 22:14:06 | Power Supply ps1.pwrok | State Asserted
2f00 | 11/14/2007 | 22:14:20 | System Firmware Progress | Motherboard initialization | Asserted
3000 | 11/14/2007 | 22:14:20 | System Firmware Progress | Video initialization | Asserted
3100 | 11/14/2007 | 22:14:26 | System Firmware Progress | USB resource configuration | Asserted
3200 | 11/14/2007 | 22:14:57 | System Firmware Progress | Option ROM initialization | Asserted
3300 | 11/14/2007 | 22:15:42 | System Firmware Progress | System boot initiated | Asserted
3400 | 11/14/2007 | 22:23:46 | System Firmware Progress | Motherboard initialization | Asserted
3500 | 11/14/2007 | 22:23:46 | System Firmware Progress | Video initialization | Asserted
3600 | 11/14/2007 | 22:23:52 | System Firmware Progress | USB resource configuration | Asserted
3700 | 11/14/2007 | 22:24:03 | System Firmware Progress | Option ROM initialization | Asserted
3800 | 11/14/2007 | 22:24:46 | System Firmware Progress | System boot initiated | Asserted
3900 | 11/14/2007 | 23:36:17 | Power Supply ps1.pwrok | State Deasserted
3a00 | Pre-Init Time-stamp   | Power Supply ps0.pwrok | State Deasserted
3b00 | Pre-Init Time-stamp   | Power Supply ps0.vinok | State Asserted
3c00 | Pre-Init Time-stamp   | Entity Presence ps0.prsnt | Device Present
3d00 | Pre-Init Time-stamp   | Power Supply ps0.pwrok | State Asserted
3e00 | 11/15/2007 | 01:36:22 | System Firmware Progress | Motherboard initialization | Asserted
3f00 | 11/15/2007 | 01:36:22 | System Firmware Progress | Video initialization | Asserted
4000 | 11/15/2007 | 01:36:28 | System Firmware Progress | USB resource configuration | Asserted
4100 | 11/15/2007 | 01:36:59 | System Firmware Progress | Option ROM initialization | Asserted
4200 | 11/15/2007 | 01:37:44 | System Firmware Progress | System boot initiated | Asserted

So here we see the event history of our system. Both Dell and Sun SPs and firmware use this event log to send warnings and such. For instance, if you want to clear a mysterious warning light on a Dell chassis, just clear the SEL with “ipmitool sel clear”.
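
A log like the one above is dominated by boot-time “System Firmware Progress” entries, so a quick tally per event source helps the unusual entries stand out. The following is a sketch under the assumption that the log uses the pipe-separated layout shown above, including the single “Pre-Init Time-stamp” field on entries logged before the clock was set; the function name is my own.

```shell
#!/bin/sh
# Hypothetical helper: count SEL entries per event source, most frequent first.
# Reads "ipmitool sel elist" output on stdin.
sel_count_by_source() {
    awk -F'|' '
        function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
        {
            # Normal entries have separate date and time fields; entries
            # logged before the clock was set have a single
            # "Pre-Init Time-stamp" field, which shifts the source left by one.
            src = (trim($2) == "Pre-Init Time-stamp") ? trim($3) : trim($4)
            if (src != "") count[src]++
        }
        END { for (s in count) print count[s], s }' | sort -k1,1nr -k2
}

# Example: ipmitool sel elist | sel_count_by_source
```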

The SEL’s best friend is Platform Event Filtering (PEF). Here we can create rules which dictate alerting policy. When an event occurs that matches a PEF rule, an alert is sent in the form of an SNMP trap, which is called a “Platform Event Trap” (PET). The default list of event rules is short on a Sun X4100:

$ ipmitool pef list
 1 | active, pre-configured | 0x11 | Any | Any | Warning | OEM | OEM | Alert,OEM-defined | 2
 2 | active, pre-configured | 0x11 | Any | Any | Critical | OEM | OEM | Alert,OEM-defined | 3
 3 | active, pre-configured | 0x11 | Any | Any | Non-recoverable | OEM | OEM | Alert,OEM-defined | 4
 4 | active, pre-configured | 0x11 | Any | Any | Information | OEM | Any | Alert,OEM-defined | 1

On the Dells it’s a bit more fine-grained:

$ ipmitool -I lanplus -H 10.0.50.60 -U root -f /ipmi.pass pef list
 1 | active | 0x11 | Fan | Any | Critical | Threshold | (0x01/0x0004),

But where do these traps go? That’s defined by the PEF policy, which is commonly configured via your SP. On Sun systems this means using the ELOM/ILOM interface; on Dell you can do this in the BIOS.

$ ipmitool -I lanplus -H 10.0.50.60 -U root -f /ipmi.pass pef policy
 1 | 1 | Match-always | 1 | 802.3 LAN | PET | public | 0 | 0 | 0.0.0.0 | 00:00:00:00:00:00
 2 | 1 | Match-always | 1 | 802.3 LAN | PET | public | 0 | 0 | 0.0.0.0 | 00:00:00:00:00:00
 3 | 1 | Match-always | 1 | 802.3 LAN | PET | public | 0 | 0 | 0.0.0.0 | 00:00:00:00:00:00
 4 | 1 | Match-always | 1 | 802.3 LAN | PET | public | 0 | 0 | 0.0.0.0 | 00:00:00:00:00:00
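
In the policy listing above every destination is still 0.0.0.0, which means the traps have nowhere to go. A quick way to spot that across many hosts might look like the sketch below. It assumes that the second-to-last column is the destination IP address and the last one the destination MAC, which is how I read the output above; check that against your own platform before relying on it. The function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the number (first column) of every PEF policy
# entry whose destination IP is still 0.0.0.0.
# ASSUMPTION: the second-to-last pipe-separated column is the destination IP.
# Reads "ipmitool pef policy" output on stdin.
pef_unset_destinations() {
    awk -F'|' '
        NF >= 3 {
            id = $1
            ip = $(NF - 1)
            gsub(/[ \t]/, "", id)    # strip column padding
            gsub(/[ \t]/, "", ip)
            if (ip == "0.0.0.0") print id
        }'
}

# Example: ipmitool pef policy | pef_unset_destinations
```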

While these two, SDR and SEL, are extremely useful, there is one more IPMI feature that you may not even be aware of... Serial over LAN, or SoL for short. It does what it sounds like: console serial redirection via IPMI over the LAN! This means that systems that once required a console server physically connected to each server's serial port can now be accessed simply using "ipmitool". This feature was standardized in IPMI v2.0 and almost all modern generation servers support it and, as noted earlier, some SP console redirection (such as DRAC 'connect com2') is in fact IPMI SoL in disguise.

$ ipmitool -I lanplus -H 10.0.50.60 -U root -f /ipmi.pass sol activate
[SOL Session operational.  Use ~? for help]
#  <-- this is a console prompt

The controls are those of SSH: use tilde-dot (~.) to disconnect. IPMI SoL isn't foolproof, however. I've run into several instances where 'ipmitool' would segfault and dump my connection for no repeatable reason, and I've seen this both "remotely" from another system and on a DRAC. So don't throw away all your console servers just yet, but there are plenty of cases where it can sure come in handy.

So, let me recap since this is a lot to digest if you're new to it:

  • IPMI is your friend.
  • Modern server motherboards possess a Baseboard Management Controller (BMC), the heart of your box, which you access via IPMI.
  • SPs are not BMCs, they just provide lights-out access to one.
  • IPMI is useful for more than checking power status and power cycling.
  • The IPMI Sensor Data Record (SDR) repository is accessible with the IPMItool "sdr" and "sensor" commands, which provide access to all system sensors.
  • IPMItool can be used locally and remotely. Didn't buy an SP or forgot to configure LAN settings in the BIOS? Just install IPMItool locally and see if you can say hello!
  • Sensor data can easily be output and formatted for input into your monitoring solution, whether it be Uptime, Nagios, Zabbix or Cacti.
  • The System Event Log (SEL) can provide meaningful insight into previous events that the OS may not have been aware of.
  • Platform Event Filtering (PEF) can be used to alert on specified events from the SEL, sending Platform Event Traps (PET) in SNMP Trap format, providing a means for asynchronously alerting on error conditions.
  • IPMI provides remote console capability, IPMI Serial-over-LAN (SoL), which can provide a low-cost/no-cost remote console access method where no other solution may be applicable. Can't access a system? Try SoL before you power cycle!
  • What's in that box? IPMItool can also output a FRU list (ipmitool fru) to assist in your auditing needs.
  • ... and much more.

I hope this gives you a new appreciation for just what IPMI can do for you and how you might be able to exploit it. In many cases it's there, right now, waiting to be used on your servers. Just because you didn't assign an IP address doesn't mean you can't use it, so please install IPMItool and give it a shot. If you had to buy servers without a DRAC or SP, don't despair, you're not missing out as much as you think.