Archive for June, 2009

Solaris Automated File Integrity Checking: bartlog

Thursday, June 25th, 2009

The Solaris Basic Audit & Reporting Tool, bart, is a great little alternative to Tripwire or AIDE. While not nearly as robust or full-featured, it does what you need it to do with very little impact. The sqlite of intrusion detection systems, if you will. I blogged about BART in 2005 and so far it’s still only got 1 real comment, which was simply mentioning AIDE as an alternative. No love.

Given that BART is awesome and no one seems to embrace it due to, perhaps, perceptions of complexity that are unfounded, I sought to implement a simple solution to bring BART to the masses. I call it bartlog.

Quite simply, bartlog is a BASH wrapper around BART and logger which is run from cron on any schedule you like and reports any changes to syslog. Setup is simple: download bartlog and copy it into /usr/sbin or wherever you prefer, then download bart.rules and copy it into /etc. Now run bartlog from cron every hour or day or whatever you like.
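For the cron step, here is a minimal sketch of adding an hourly entry to root's crontab without clobbering what is already there. The function name, the CRONTAB override and the exact schedule are my own assumptions for illustration; only the /usr/sbin/bartlog path comes from the setup described above.

```shell
#!/bin/bash
# Hypothetical helper, not part of bartlog itself.
CRONTAB=${CRONTAB:-crontab}
# Assumed schedule: minute 0 of every hour.
ENTRY='0 * * * * /usr/sbin/bartlog'

install_bartlog_cron() {
    local current tmp
    # Read the existing crontab; an empty or missing crontab yields an empty string.
    current=$("$CRONTAB" -l 2>/dev/null)

    # Do nothing if the entry is already installed.
    case "$current" in
        *"$ENTRY"*) return 0 ;;
    esac

    # Write old entries plus the new one to a temp file, then load that file.
    tmp=$(mktemp) || return 1
    if [ -n "$current" ]; then
        printf '%s\n' "$current" > "$tmp"
    fi
    printf '%s\n' "$ENTRY" >> "$tmp"
    "$CRONTAB" "$tmp"
    rm -f "$tmp"
}
```

Run install_bartlog_cron as root once; running it again should leave the crontab unchanged.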

The script is simple and intended to be tweaked, modified and made as l337 as you like. It creates a BART manifest (a record of files and MD5 checksums) for the directory structures specified in the bart.rules file. The first time it runs it just creates a manifest and exits. The second time you run it, it creates a new manifest and compares it against the previously created one. If it doesn’t find any changes it just replaces the old manifest with the new one, which keeps you from being alerted repeatedly. However, if it does find a change it sends the change to syslog, so that it’s stored with your normal logs and can be viewed either by running dmesg or reading /var/adm/messages. By default I’m using the syslog audit.err priority because by default Solaris sends those messages to /var/adm/messages; however, if you are deploying this in a production environment I’d recommend using audit.warn instead and then modifying /etc/syslog.conf to send those warnings to a secure centralized syslog server. If you complete the solution with Splunk you could have a centralized, searchable log of all changes to critical files, which you could report on, respond to or alert on.
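The flow described above can be sketched roughly as follows. This is not the actual bartlog script, just an illustration of the create/compare/log cycle. The manifest directory, variable names and function name are assumptions, and rotating the manifest even after a change is reported is my own choice here.

```shell
#!/bin/bash
# Hypothetical sketch of the bartlog flow; names and paths are assumptions.
RULES=${RULES:-/etc/bart.rules}
MANIFEST_DIR=${MANIFEST_DIR:-/var/tmp/bartlog}   # assumed location
BART=${BART:-bart}
LOGGER=${LOGGER:-logger}
PRIORITY=${PRIORITY:-audit.err}

bartlog_run() {
    local old="$MANIFEST_DIR/manifest.old"
    local new="$MANIFEST_DIR/manifest.new"
    mkdir -p "$MANIFEST_DIR" || return 1

    # Always build a fresh manifest from the rules file.
    "$BART" create -r "$RULES" > "$new" || return 1

    # First run: nothing to compare against, so keep the manifest and exit.
    if [ ! -f "$old" ]; then
        mv "$new" "$old"
        return 0
    fi

    # "compare -p" prints one line per changed file; send each line to syslog.
    "$BART" compare -p "$old" "$new" | while IFS= read -r line; do
        "$LOGGER" -p "$PRIORITY" "BART Reports Change: $line"
    done

    # Rotate so the same change is not reported on every run (an assumption).
    mv "$new" "$old"
}
```

Calling bartlog_run from a cron-driven script would give one syslog line per changed file, in the same shape as the log excerpt below.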

I use a one hour interval on my home workstation. Here’s my syslog following a new user addition:

root@quadra ~$ dmesg
...
Jun 25 11:01:58 quadra root: [ID 702911 audit.error] BART Reports Change: /etc/.pwd.lock mtime 4a218d04 4a43b0bd
Jun 25 11:01:58 quadra root: [ID 702911 audit.error] BART Reports Change: /etc/opasswd size 968 985 mtime 49fa4236 4a218d2b contents fc27c5b28b3a248b6c6129aa9aed7329 2200107fc7128d5cd38de333bea4500f
Jun 25 11:01:58 quadra root: [ID 702911 audit.error] BART Reports Change: /etc/ouser_attr mtime 4a04a741 4a218d01
Jun 25 11:01:58 quadra root: [ID 702911 audit.error] BART Reports Change: /etc/passwd size 985 1022 mtime 4a218d2b 4a43b0ac contents 2200107fc7128d5cd38de333bea4500f 640da69537a35046571b4fda1def10d1
Jun 25 11:01:58 quadra root: [ID 702911 audit.error] BART Reports Change: /etc/shadow size 708 783 mtime 4a218d04 4a43b0bd contents f83158dffddc124dab2f22a979338695 6ba7d42600da8d4fc9b8a92f4bf0afe7
Jun 25 11:01:58 quadra root: [ID 702911 audit.error] BART Reports Change: /etc/user_attr mtime 4a218d01 4a43b0ac
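The mtime fields in these lines appear to be seconds since the Unix epoch written in hexadecimal. A small sketch for decoding one of them (the function name is hypothetical):

```shell
#!/bin/bash
# Convert a hexadecimal epoch value, as printed in the BART output, to decimal seconds.
hex_to_epoch() {
    printf '%d\n' "0x$1"
}

# The new mtime reported for /etc/passwd in the log above.
hex_to_epoch 4a43b0ac   # prints 1245950124
```

That value falls on June 25th, 2009 (UTC), which lines up with the date of the log entries.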

So I hope this fills the hole. Anyone running a Solaris system at home can download these two files, add bartlog to cron and be off and running. No hassle, no maintenance. All the love, none of the pain. If you’re running a system where bart isn’t installed, just install SUNWbart from IPS or the install media.

Crashing Solaris for Fun and Profit

Tuesday, June 9th, 2009

Crashing is the wrong title actually. We’re talking about panics. It’s sort of like saying “hacking” when you mean “cracking”. A “crash” is when an OS performs some operation that typically causes the system to reboot. Solaris differs from rival Linux in that 99% of the time such an event will be caught by the OS and handled as a “panic” instead of an uncontrolled crash.

While it’s not a sexy feature of Solaris, panics are extremely important things. The reality is that sh*t happens, it just does. When it does happen you want to collect as much information about the event as possible to fix the problem that resulted in badness. Therefore, the advantage of a panic over a crash is a crash dump which can be analyzed for the cause. This is why panics are a very good thing indeed. As I like to say, “If you’ve gotta crash, fine, but you’d better give me a reason!”

This blog entry is interested in the 1% of Solaris issues that don’t result in a panic, but you wish would. In between the concepts of a “crash” and a “panic” is that dreaded situation we call “hung”. A “hung” or “locked” or “wedged” system is one that is still running, technically, but is otherwise unusable. Typically this is the result of the kernel being ok but the userland stack being trashed beyond repair. Perhaps the two most common causes of this are abuse cases such as memory/swap exhaustion or fork bombs, where the kernel is still running but processes can’t spawn to let you see what’s happening. The normal way to deal with this is to reboot the system, either via IPMI or sending some poor soul (typically, you) down to the data center to unceremoniously press the big button.

So here’s the problem with rebooting a system in a bad way… it doesn’t panic, meaning you might get the system back up post-reboot but you have no idea what happened unless something in the logs tips you off. If there are no relevant log entries you have to simply shrug and pray that it doesn’t happen again. So what can you do about it? On SPARC systems you’d hit “STOP-A” and sync the box. But what about Solaris/X86… what do you do then?

Thankfully there are three very handy ways of dealing with such situations. They can be combined or used individually. Let’s take them in turn.

Please note! This entry applies only to Solaris/X86!

Panic on NMI

Intel introduced the concept of Non-Maskable Interrupts a long while back. These NMIs are extremely high priority and cannot be blocked by the OS. While I’ve had trouble fully researching them, the most common use is to kick an otherwise unresponsive system into a “diagnostic” mode. On some systems it’s implemented as a jumper, on others a button, and on yet others as IPMI or SP commands. In the case of Solaris/X86 there are two tunables that can cause the OS to react to an NMI: one causes a panic, and the other causes the system to drop into a kmdb session for live debugging. By default Solaris will simply output a message to the console saying an NMI was received but otherwise do nothing.

The first of these two tunables can be added to /etc/system. It is not dynamic, so you must reboot for it to take effect. It is:

set pcplusmp:apic_panic_on_nmi=1

Once this is added and the system rebooted, the receipt of an NMI will cause an immediate panic followed by a reboot. The most common way to invoke this behavior would be via the IPMI command: ipmitool -I lanplus -H somehost -U root chassis power diag. Remember, if you get a seriously stuck system you’re going to reboot the box anyway, typically via IPMI, so instead of using “power cycle” you choose “power diag” and have some tasty data to dig through (or send to Sun).

I like to call this feature “panic on demand”. :)

Enter kmdb On NMI

In addition to, or instead of, the panic option above, we can use the following tunable to drop the system into the kmdb kernel debugger on NMI receipt:

set pcplusmp:apic_kmdb_on_nmi=1

If it follows the panic option above, it’ll panic and then drop into the debugger, which is the best option. But please know that this will only work if you have kmdb loaded at boot time by adding the -k kernel argument via your bootloader (i.e., GRUB). If you’re working with a production system this might not be something you want hanging over your head all the time, which makes it a more developer-oriented solution in my mind. If you’re writing drivers you’ll certainly want to keep kmdb loaded, but everyone else will more likely prefer the “panic and reboot” option above.
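After the reboot, one way to check whether either NMI tunable actually took effect is to read the variable from the running kernel with mdb. This is a sketch only: the function name and the MDB override are hypothetical, and on some builds the symbol may need to be scoped to its module (for example pcplusmp`apic_panic_on_nmi) rather than given bare.

```shell
#!/bin/bash
# Hypothetical helper for reading a kernel integer variable on a live system.
MDB=${MDB:-mdb}

show_tunable() {
    # "/D" asks mdb to print the symbol's value as a 4-byte decimal integer;
    # "-k" attaches mdb to the running kernel (requires root).
    echo "$1/D" | "$MDB" -k
}
```

Usage would be show_tunable apic_panic_on_nmi or show_tunable apic_kmdb_on_nmi, run as root on the Solaris/X86 box in question.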

Snooping; The Deadman Watchdog Timer

Ever watch or read The Abyss? There was a mini-sub that would automatically float to the surface with surveillance tapes and such if a timer wasn’t reset every 12 hours, in hopes that if anything went wrong there would at least be a partial record of events. Watchdog timers are a similar concept… the kernel uses some means to determine whether it is still functioning, and if it ceases to function the white flag is waved and it panics.

While it’s rare, the comment in the code (line 163) is the best explanation: “Setting “snooping” to a non-zero value will cause a deadman panic if snoop_interval microseconds elapse without lbolt increasing. The default snoop_interval is 50 seconds.”

So we simply add the following tuning(s) to /etc/system:

set snooping=1
set snoop_interval=90000000

The first line enables snooping. The second line changes the default 50 second interval to 90 seconds. I don’t see any reason that 50 seconds isn’t long enough, but if you want to be paranoid and use a 5 minute interval you can: just convert 300 seconds to microseconds (300000000) and reboot to lock it in.
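Since snoop_interval is expressed in microseconds, it is easy to drop or add a zero by hand. A tiny sketch that does the conversion and prints the /etc/system line (the function name is hypothetical):

```shell
#!/bin/bash
# Print the /etc/system line for a deadman interval given in seconds.
snoop_interval_line() {
    local seconds=$1
    # 1 second = 1,000,000 microseconds
    echo "set snoop_interval=$((seconds * 1000000))"
}

snoop_interval_line 300   # prints: set snoop_interval=300000000
```

Passing 90 reproduces the value used above, set snoop_interval=90000000.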

Snooping has been safe in all my testing to date, but obviously it will feel risky to the casual sysadmin, so this is not something I’d enable by default. If, however, you have a system that mysteriously goes into a dead hang in the middle of the night, this is a better option than being woken up just to testify that it did actually reboot and you still have no idea why. :)

Poking NMI

As stated, the best way to poke NMI once you’re ready for it is via IPMI (tested on Sun and Dell):

# ipmitool -I lanplus -H somebox -U root chassis power diag

If you have a moderately recent version of ILOM you can poke the /SP/diag/generate_host_nmi value like so:

-> cd /SP/diag
/SP/diag

-> show

 /SP/diag
    Targets:
        snapshot

    Properties:
        generate_host_nmi = (Cannot show property)
        state = disabled

    Commands:
        cd
        set
        show

-> set generate_host_nmi=true
Set 'generate_host_nmi' to 'true'

What Next?

Once you’ve got a dump things are on the uptick, but given that you may not be a kernel developer, and if you’re wise have no desire to become one, you have some options. The first is to send the panic to Sun and hope they come back with a good answer. The second is to use this as your opportunity to dig deep into the guts of Solaris and learn something interesting. I recommend reading my blog entries from some time ago, Solaris Core Analysis, Part 1: mdb and Part 2: Solaris CAT, reading The Solaris Operating System on x86 Platforms: Crashdump Analysis Operating System Internals (PDF), getting the latest SolarisCAT, and if possible attending one of Max Bruning’s courses.

Finally, a shout out to my p33ps in #opensolaris IRC for their help with my testing and research on NMI, in particular “LONGCAT”.

CommunityOne/JavaOne Wrapup

Thursday, June 4th, 2009

CommunityOne is done, and JavaOne is passing by. As usual a great show. There is no way to even compare other events with that of JavaOne and associated events. The energy was high this year, although under a big “what is Oracle going to do??” cloud of fear.

From an OpenSolaris perspective there were a great number of fantastic talks. Of special interest were emerging technologies that have tremendous disruptive potential, such as COMSTAR and Crossbow. As a major advocate of Crossbow I was glad that it got so much attention, especially in the CommunityOne General Session (i.e., the Morning Keynote).

Of course, the most important aspect of events like this is the people themselves. I missed a lot of sessions I wanted to attend due to “hallway chatter”. The collection of personalities there is amazing. I can’t even make a list, it’s so long.

This was the first year I actually stayed in a hotel for the show. That makes things all the more interesting. For instance, there was a Sun Employee party at 9PM on Monday. I suddenly realized how many of my friends are Sun employees. After unsuccessfully attempting to crash the party with John Plocher, he and I hung out and talked for a while until he headed home. Then I was adrift… so I hit my hotel room, settled in and set out for dinner. Walking out of my hotel, on the sidewalk, was Max Bruning! Both of us were taken aback and he agreed to let me buy him a beer at the bar we were standing in front of.

Talking with Max is a unique and special treat. He is an amazing trainer, and that comes through clearly. Unlike most uber-geeks there is no ego or superiority in him. You can talk with him about gaps in your knowledge and he’ll start talking about it in depth in a totally non-threatening way. This is something rare at conferences, where you feel like you have to hide your inabilities for fear of a swift degrading reprimand. Besides, how many people do you meet that you can talk freely about ZFS and Solaris Kernel internals with? I can’t tell you how much I valued the hour or two I got to talk with him. Once I got home I attacked some pending post-mortems with a renewed gusto. I only caught the last hour of his Deep Dive session, but it was incredibly useful. I’ll post the video link as soon as it’s available.

As for my talks… I did 3, as I noted prior to the conference in this blog. My talks went pretty well. I opted to go the “entertaining and informative” route. The comments on each were pretty positive. In retrospect I could have done a lot of refining in my delivery, but for these types of events it’s always difficult to guess at the audience, so I shoot for the middle and go high energy to at least be fun to listen to. ;)

Anyway…. most of the talks were video taped and will be appearing online next week. (Post-processing takes a while.) Sun wanted live feeds on things, but weren’t able to get it done through their AV people.

Of interest to the community might have been the OpenSolaris Town Hall, a chance to grill the OGB as a panel. That session was not taped, but nothing happened. The OGB updated the community on what they are working on (all of which is available on the web) and opened the floor to questions. The problem was that it was at the same time that beer was being served at the CommunityOne party and most people seemed more interested in the beer than politics… consequently there were no questions, perhaps because anything that should be said already has been. The vibe was that of resignation to the whole process and trust in the direction that Madam Michelle Olsen is going. What’s to discuss? So the event wrapped up quickly.

Thanks to everyone who went and participated. A special thanks to Teresa Giacomini, Lynn Rohrer, Markus Flierl, and especially Deirdre Straughan for allowing me the privilege of participating in such an excellent event.