Solaris Extended Accounting in the Real World

Posted on October 21, 2009

Recently we talked about Solaris Auditing (BSM) in the Real World. Like BSM, Extended Accounting is a fantastic feature of Solaris that is utterly useless without tools. Solaris goes so far as giving you the capability but not so far as to hand you the rest of the solution on a silver platter. On one hand this means that the technology isn’t pigeon holed due to the capabilities of a single tool, but at the same time it creates a barrier to entry that causes many people to simply ignore it all together. So, yet again, let me provide a simple tool to fill some of that void.

In a previous post, Solaris Extended Accounting, I described Extended Accounting and provided two scripts to get you started, one was a PERL script to dump Extended Accounting (“exacct”) data files and the other was called “prettyproc” which output Proccess Accounting files in a more human friendly way. This post should be viewed as Part 2 of that post.

When & How to use Extended Accounting

The most basic explanation of Extended Accounting is this: a facility that records certain events upon completion for later analysis. Those certain events depend on which of the four accounting types we’re using. For processes, the cumulative data maintained by Solaris microstate accounting is written into a single record as process termination. For tasks, which are groups of processes within a single project, the same applies but recorded on each task termination rather than process. For (Crossbow) net, aggregate network utilization is written out on regular intervals (15 seconds). We’ll ignore IPQoS “flow” Accounting entirely for the time being.

So the first thing we should say is that Extended Accounting is not a monitoring facility. If you want to know how much CPU or Memory is being used at some given time you should rely upon Kstats or /proc statistics on a polling schedule.

What Extended Accounting is good for is reporting. Consider ‘net’ accounting; every 15 seconds a record is created for each data link (dladm show-link). You could easily create a report at some interval (hour, day, week, month?) for both total bytes/packets sent/recieved on each link or great a graph or perhaps most likely calculate 95th percentile on the links. Now, in this case of ‘net’ accounting you could also use an external system to poll the data remotely via SNMP or locally via kstats, but this might serve as a better “definiative” local record.

Proc accounting is fuzzy ground though. The best way I can explain process accounting is to imagine that every time you executed a command Solaris was secretly running “time(1M)” and then storing the output on your behalf.

benr@quadra Downloads$ time tar xfj flash_player_10_solaris_x86.tar.bz2 

real    0m0.763s
user    0m0.705s
sys     0m0.070s

This is, essentially, whats happening! Solaris maintains a lot of detail on what processes are doing (known as “microstate accounting”). Normally, when a process terminates that data is simply discarded, however if Process Extended Accounting is enabled its dumped out as a record! From this record we can see interesting stats such as when the process started, when it finished (real time), how long it spent cpu time in kernel-land (sys time), how long it spent cpu time in user-land (user time), how many context switches it made, how much swapping it did, what its average RSS memory usage was, etc, etc, etc.

But as wonderful as this is, I have to make it crystal clear that this data isn’t written out untill a process terminates! If MySQL runs for 4 months, it outputs a single record when it was finally shut down, and that record is the accumulation of that full 4 months of running!

Here is the exception. Proc and Task records can be “full” or “partial”. When a process/task terminates and creates a record, that’s a “full record”. However, using “wracct” we can force a process or task to create a “partial record”, which is essentially a way of saying “Just tell me what you’ve got so far!” The rub is that, in the proc case, that data is cumulative, so if you wanted to report on what a process has done in the last 24 hours you need to write a partial record every 24 hours and the find the difference between the partial record yesterday and partial record today. Talk about fun.

Now, besides all that, who actually bills users or reports usage based on total CPU time? Total context switches? This isn’t the 1970’s nor is this likely to be a Super Computer reporting computational time. In short, the data probably isn’t terribly useful as a basis for billing in this day and age without some creative thought.

So then lets think… what can we determine from the data. Based on CPU usage we could determine what the top 5 CPU consuming processes were. Based on average RSS usage we could determine what the top memory consumers were. So on and so forth. Interesting perhaps… but worth it?

Go back to what I said about running “time” on every command. This data could be of used for capacity planning or, with some intelligence, behavior monitoring. Are your users complaining about commands taking too long to run, but when you ask how long they give you a bogus number or simply shrug? Extended Accounting can tell you. Are batch jobs running at night but want a record of when they started and how resource hungry they were? Here is a way that doesn’t involve writing wrappers!

In short, Extended Accounting is a pretty lousy billing system on todays mulit-core systems, but it can provide useful historical statistics to questions that might be otherwise difficult to answer.

Practical Tools

The first tool I’ll provide you with is a PERL replacement for the Solaris included /usr/demo/libexacct/exdump.c: exdebug.pl. This tool offers the following advantages:

  1. exdump.c hasn’t been updated for the new Crossbow provided ‘net’ accounting data; exdebug.pl is module agnostic and works with them all.
  2. The output is just much cleaner and intuitive for exploring what ExAcct can do for you.
  3. Its implemented in PERL making it easier to get in there and build something, rather than dealing with the libexacct learning curve in C. If nothing else you can quickly prototype and then re-implement in C.

Here is an example from an ‘net’ record:

benr@quadra exacct$ acctadm net
            Net accounting: active
       Net accounting file: /var/adm/exacct/net
     Tracked net resources: extended
   Untracked net resources: none
benr@quadra exacct$ pfexec ./exdebug.pl /var/adm/exacct/net | more
Creator:  SunOS
Hostname: quadra

---------------- OBJECT 0 -----------------------
Object is: EO_GROUP   -   Catalog: EXT_GROUP EXC_DEFAULT EXD_GROUP_NET_LINK_DESC
                Id: EXD_NET_DESC_NAME   Value: testzone0
                Id: EXD_NET_DESC_EHOST  Value: 
                Id: EXD_NET_DESC_EDEST  Value: 
                Id: EXD_NET_DESC_VLAN_TPID      Value: 0
                Id: EXD_NET_DESC_VLAN_TCI       Value: 0
                Id: EXD_NET_DESC_SAP    Value: 0
                Id: EXD_NET_DESC_PRIORITY       Value: 0
                Id: EXD_NET_DESC_BWLIMIT        Value: 0
                Id: EXD_NET_DESC_DEVNAME        Value: testzone0
                Id: EXD_NET_DESC_V4SADDR        Value: 0
                Id: EXD_NET_DESC_V4DADDR        Value: 0
                Id: EXD_NET_DESC_SPORT  Value: 0
                Id: EXD_NET_DESC_DPORT  Value: 0
                Id: EXD_NET_DESC_PROTOCOL       Value: 0
                Id: EXD_NET_DESC_DSFIELD        Value: 0
...
---------------- OBJECT 67 -----------------------
Object is: EO_GROUP   -   Catalog: EXT_GROUP EXC_DEFAULT EXD_GROUP_NET_LINK_STATS
                Id: EXD_NET_STATS_NAME  Value: e1000g1
                Id: EXD_NET_STATS_CURTIME       Value: 1256033841
                Id: EXD_NET_STATS_IBYTES        Value: 2411692067
                Id: EXD_NET_STATS_OBYTES        Value: 202604900
                Id: EXD_NET_STATS_IPKTS         Value: 2005669
                Id: EXD_NET_STATS_OPKTS         Value: 1265178
                Id: EXD_NET_STATS_IERRPKTS      Value: 0
                Id: EXD_NET_STATS_OERRPKTS      Value: 0
---------------- OBJECT 68 -----------------------
Object is: EO_GROUP   -   Catalog: EXT_GROUP EXC_DEFAULT EXD_GROUP_NET_FLOW_STATS
                Id: EXD_NET_STATS_NAME  Value: inbound_ssh
                Id: EXD_NET_STATS_CURTIME       Value: 1256033841
                Id: EXD_NET_STATS_IBYTES        Value: 93958770
                Id: EXD_NET_STATS_OBYTES        Value: 106077944
                Id: EXD_NET_STATS_IPKTS         Value: 238395
                Id: EXD_NET_STATS_OPKTS         Value: 321977
                Id: EXD_NET_STATS_IERRPKTS      Value: 0
                Id: EXD_NET_STATS_OERRPKTS      Value: 0
---------------- OBJECT 69 -----------------------
Object is: EO_GROUP   -   Catalog: EXT_GROUP EXC_DEFAULT EXD_GROUP_NET_LINK_STATS
                Id: EXD_NET_STATS_NAME  Value: testzone0
                Id: EXD_NET_STATS_CURTIME       Value: 1256033861
                Id: EXD_NET_STATS_IBYTES        Value: 4528169
                Id: EXD_NET_STATS_OBYTES        Value: 0
                Id: EXD_NET_STATS_IPKTS         Value: 64405
                Id: EXD_NET_STATS_OPKTS         Value: 0
                Id: EXD_NET_STATS_IERRPKTS      Value: 0
                Id: EXD_NET_STATS_OERRPKTS      Value: 0

In the above example you’ll see the variety of objects offered by the net accounting module, including link descriptions, link statistics (‘testzone0’ is a VNIC and ‘e1000g1’ is a physical interface), and flow statistics (inbound_ssh is a flowadm defined flow).

The second tool is exacctly, a human friendly Proc Extended Accounting dumper. It is also implemented in PERL and in fact was derived from the exdebug app above.

benr@quadra exacct$ acctadm proc
         Process accounting: active
    Process accounting file: /var/adm/exacct/proc
  Tracked process resources: extended
Untracked process resources: host
benr@quadra exacct$ pfexec ./exacctly /var/adm/exacct/proc | more
Creator:  SunOS
Hostname: quadra

      ZONE    UID    GID    PID                  CMD |   Real   User        Sys |               Start Date |    RSS AVG      RSS MAX     SysCalls      Swaps 
 ----------------------------------------------------+--------------------------+--------------------------+--------------------------------------------------
    global      0      0   1922              acctadm |   0.07   0.00       0.01 | Tue Oct 20 03:10:01 2009 |        524 K      12904 K        450          0 | FULL
    global      0      0   1920                   sh |   0.07   0.00       0.00 | Tue Oct 20 03:10:01 2009 |       2036 K      12904 K        103          0 | FULL
    global     25     25   1924             sendmail |   0.10   0.01       0.01 | Tue Oct 20 03:10:01 2009 |       1912 K      12904 K        543          0 | FULL
    global      0      0   1927             sendmail |   0.01   0.00       0.01 | Tue Oct 20 03:10:01 2009 |       2288 K      13172 K        267          0 | FULL
    global      0      0   1923                 mail |   0.10   0.00       0.00 | Tue Oct 20 03:10:01 2009 |        504 K      12904 K        169          0 | FULL
    global      0      0   1921                   sh |   0.11   0.00       0.00 | Tue Oct 20 03:10:01 2009 |        920 K      12904 K        102          0 | FULL

The output is really wide, but everyone should have a big ol’ screen these days. Notice the depth of information here. For each terminated process we see the zone it was in, user and group, PID and command name itself (ExAcct doesn’t record arguments), then we see real/sys/user time in seconds (ExAcct actually has nanosecond granularity, so these are rounded numbers), the start time and other goodness. The last column reports whether the record is full or partial.

This tool is, in and of itself, useful for many administrators to start using Extended Accounting that might otherwise have ignored it. Even more so, I hope it sparks your interest and imagination as to the possibilities! Just think of all the ways to amaze your boss and fellow admins!

Data File Rotation

Like any log, don’t be lazy and forget to rotate those files or you’ll have a mess on your hands. Rotating your extended accounting data files will make them easier to dissect and consume less disk. Here are some examples lines you can drop into /etc/logadm.conf, Solaris’s default log rotation tool:

/var/adm/exacct/proc -N -p 1d -C 7 -b '/usr/sbin/acctadm -x process' -a '/usr/sbin/acctadm -e extended -f /var/adm/exacct/proc process'
/var/adm/exacct/net -N -p 1d -C 7 -b '/usr/sbin/acctadm -x net' -a '/usr/sbin/acctadm -e extended -f /var/adm/exacct/net net'
/var/adm/exacct/task -N -p 1d -C 7 -b '/usr/sbin/acctadm -x task' -a '/usr/sbin/acctadm -e extended -f /var/adm/exacct/task task'

These examples will rotate each day (-p 1d) and keep 7 logs (-C 7) before destroying. The important bit is that you can’t just mv the file, you need to stop accounting, rotate, then resume it.

Remember to ensure that logadm isn’t commented out in the root crontab.

Parting Thoughts & Cautions

Before I wrap up, I want to note something about Process records. Here is one as seen with exdebug:

---------------- OBJECT 0 -----------------------
Object is: EO_GROUP   -   Catalog: EXT_GROUP EXC_DEFAULT EXD_GROUP_PROC
                Id: EXD_PROC_PID        Value: 1922
                Id: EXD_PROC_UID        Value: 0
                Id: EXD_PROC_GID        Value: 0
                Id: EXD_PROC_PROJID     Value: 1
                Id: EXD_PROC_TASKID     Value: 39949
                Id: EXD_PROC_CPU_USER_SEC       Value: 0
                Id: EXD_PROC_CPU_USER_NSEC      Value: 2047013
                Id: EXD_PROC_CPU_SYS_SEC        Value: 0
                Id: EXD_PROC_CPU_SYS_NSEC       Value: 6237135
                Id: EXD_PROC_START_SEC  Value: 1256033401
                Id: EXD_PROC_START_NSEC         Value: 311640743
                Id: EXD_PROC_FINISH_SEC         Value: 1256033401
                Id: EXD_PROC_FINISH_NSEC        Value: 380283918
                Id: EXD_PROC_COMMAND    Value: acctadm
                Id: EXD_PROC_TTY_MAJOR  Value: 4294967295
                Id: EXD_PROC_TTY_MINOR  Value: 4294967295
                Id: EXD_PROC_FAULTS_MAJOR       Value: 0
                Id: EXD_PROC_FAULTS_MINOR       Value: 0
                Id: EXD_PROC_MESSAGES_SND       Value: 0
                Id: EXD_PROC_MESSAGES_RCV       Value: 0
                Id: EXD_PROC_BLOCKS_IN  Value: 0
                Id: EXD_PROC_BLOCKS_OUT         Value: 0
                Id: EXD_PROC_CHARS_RDWR         Value: 20100
                Id: EXD_PROC_CONTEXT_VOL        Value: 102
                Id: EXD_PROC_CONTEXT_INV        Value: 0
                Id: EXD_PROC_SIGNALS    Value: 0
                Id: EXD_PROC_SWAPS      Value: 0
                Id: EXD_PROC_SYSCALLS   Value: 450
                Id: EXD_PROC_ACCT_FLAGS         Value: 2
                Id: EXD_PROC_ANCPID     Value: 1920
                Id: EXD_PROC_WAIT_STATUS        Value: 0
                Id: EXD_PROC_ZONENAME   Value: global
                Id: EXD_PROC_MEM_RSS_AVG_K      Value: 524
                Id: EXD_PROC_MEM_RSS_MAX_K      Value: 12904

Okey, lots of data, lots of goodness. Notice EXD_PROC_BLOCKS_IN, OUT, and CHARS_RDWR? They are useless. I can’t go into why here, but don’t get excited about them or bother doing anything, the values are crap. If your a veteran Kstat diver you’ll recognize similar values in the Kstat cpu_stat class… same story.

Hopefully this post as helped provide you with a more practical understanding of Extended Accounting and provided you with some resources to get in there and use the data. There is a wealth of possibilities if you just avail yourself of them. 🙂