A couple of things here and there have kept me from continuing my series of posts on systems management solutions. One of the monitoring solutions I’ve planned to write about is Up.Time. While I haven’t had the time to write it, I was thrilled to check my favorite site, SunHelp.org, and see that Super Admin Bill Bradford wrote an excellent review himself: Software Review: up.time 4 Enterprise Monitoring.
In my professional opinion, Up.Time is the best, most comprehensive, and most polished out-of-the-box solution available at any price. Yes, it’s proprietary, closed-source commercial software. But whether you’re using Zenoss, HP OpenView, Hyperic, or any other solution out there, you’re only going to get a small subset of monitoring capability without spending some time extending it yourself or digging around for modules written by someone else. Most, such as NetNMS or Zenoss, are limited to the OIDs exposed by SNMP and are then extended with custom scripts that SSH into boxes every n seconds. Others, such as Zabbix and Hyperic, provide a client-side agent that gathers fairly generic information such as disk usage, CPU and memory usage, and maybe a few odds and ends on top. Up.Time, by contrast, gathers a massive range of metrics, stores them all, and provides graphing and reporting capabilities, including report automation, that make all of that data genuinely useful.
I’ve solved more than a few problems after realizing that the historical data I needed for the analysis was already right under my nose: Up.Time had been gathering it all along without my noticing. A great example is IO response time! I spent quite a bit of time ripping apart iostat.c to learn how to extend Zabbix, Hyperic, or other solutions to record asvc_t, and then I realized that Up.Time was already gathering all of that data right out of the box. Not only does it gather single-value metrics, it also stores useful multi-value data such as the top CPU-consuming processes during a given period. Just knowing that the CPU was saturated on Monday of last week isn’t enough! What was actually using that CPU? Up.Time can tell you, no modification required.
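To make the difference between a single-value metric and multi-value data concrete, here is a minimal Python sketch that stores a ranked process list alongside each CPU sample, so that "what was using the CPU last Monday?" can be answered after the fact. It is purely illustrative and says nothing about how Up.Time is actually implemented: `Sample`, `take_sample`, and `top_consumers` are hypothetical names, and the per-process figures are assumed to arrive as a simple name-to-percent mapping.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Sample:
    """One polling interval: a scalar metric plus the processes behind it."""
    timestamp: int                      # seconds since epoch (assumed unit)
    cpu_percent: float                  # overall CPU utilisation for the interval
    top_processes: List[Tuple[str, float]] = field(default_factory=list)


def take_sample(timestamp: int, per_process_cpu: Dict[str, float], top_n: int = 3) -> Sample:
    """Build a sample from a hypothetical {process name: CPU %} mapping."""
    ranked = sorted(per_process_cpu.items(), key=lambda kv: kv[1], reverse=True)
    return Sample(timestamp, sum(per_process_cpu.values()), ranked[:top_n])


def top_consumers(history: List[Sample], start: int, end: int, top_n: int = 3) -> List[Tuple[str, float]]:
    """Average each recorded process's CPU share over samples in [start, end)."""
    totals: Dict[str, float] = {}
    count = 0
    for sample in history:
        if start <= sample.timestamp < end:
            count += 1
            for name, pct in sample.top_processes:
                totals[name] = totals.get(name, 0.0) + pct
    if count == 0:
        return []
    averaged = [(name, total / count) for name, total in totals.items()]
    return sorted(averaged, key=lambda kv: kv[1], reverse=True)[:top_n]


# Two one-minute samples with made-up process names and figures.
history = [
    take_sample(0, {"oracle": 60.0, "java": 30.0, "sshd": 1.0, "cron": 0.5}, top_n=2),
    take_sample(60, {"oracle": 80.0, "java": 10.0}, top_n=2),
]
busiest = top_consumers(history, start=0, end=120, top_n=2)
```

A tool that keeps only `cpu_percent` can show that the box was busy; keeping `top_processes` with every sample is what lets you name the culprit later.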
In all my searching to date, I’ve found only one “install and forget” solution on the market, and that’s Up.Time. If you want to solve your monitoring problems with money, it’s hands down the solution you need to use. I’m not saying it’s perfect; there are a couple of things here and there I’d like to change, but I’m hard pressed to find anything as powerful.