Using Graphite to Graph DTrace Metrics

If you haven’t heard of Graphite, you are missing out on a serious operations power tool. Let me make a gross oversimplification and slightly inaccurate assertion to get you in the ballpark of understanding what it is: it’s RRDtool reimplemented for the web.

Let me be more specific for those new to it. Graphite is really made up of 3 components. The first is “Carbon”, a metrics collection daemon that receives data over a network socket (plain TCP on port 2003 by default, with an optional UDP listener), caches it and then records it to disk. The second is “Whisper”, the fixed-size round robin database that Carbon uses to store your metrics on disk. The third is a Django app that can generate graphs from your metrics via a snazzy web UI or a simple URL API. So it implements a round robin database like RRDtool and a means of graphing the data like RRDtool, but it’s accessible via a browser and graphs dynamically, so unlike RRDtool it isn’t necessary to pre-render static graphs at some interval.

There are 3 reasons I find it really hard to ignore Graphite. Firstly, you do not need to pre-generate your databases: if you send it a metric it hasn’t seen before, it just creates the database based on a flexible schema configuration. Secondly, you can get your graphs essentially in real time by just refreshing a URL, with no pre-generation. Thirdly, you can send it metrics using something as simple as netcat. The result is an insanely flexible metrics graphing system that needs very little configuration and doesn’t necessarily need any agents.
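For the curious, that schema configuration lives in Carbon’s storage-schemas.conf. A minimal sketch of an entry for metrics whose names start with “dtrace.” might look like the following. The section name and the retention values are assumptions for illustration, and older Graphite releases may expect a different retention syntax, so check the documentation for your version:

```ini
# Hypothetical entry: the section name and retentions are illustrative only
[dtrace]
pattern = ^dtrace\.
# keep 1-second points for a day, then 1-minute points for 30 days
retentions = 1s:1d,1m:30d
```

Carbon applies the first matching entry when it creates the Whisper file for a new metric, so more specific patterns belong above more general ones.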

So let me demonstrate how we can use all this power together with DTrace in a sample script:

#!/bin/bash
# Example DTrace/Graphite Integration
# Ben Rockwood 

# Exported so the shell that system() spawns can expand them
export HOSTNAME=$(hostname)
export GRAPHITE_SERVER="10.0.0.22"

/usr/sbin/dtrace -n '

#pragma D option destructive
#pragma D option quiet

BEGIN
{
        mycounter = 0;
}

syscall::read:entry
{
        mycounter++;
}

tick-1sec
{
        /* system("echo \"DEBUG: Sending data to metric dtrace.$HOSTNAME.syscall.read.entry
                                    on server $GRAPHITE_SERVER\" "); */
        system("echo \"dtrace.$HOSTNAME.syscall.read.entry %d %d\" | nc $GRAPHITE_SERVER 2003 ",
                     mycounter, walltimestamp / 1000000000);
        mycounter = 0;
}
'

So what I’m doing here is running a DTrace script via Bash. I’m using Bash as a wrapper so that I can do setup such as getting the hostname. The DTrace script itself is overly simplistic: we’re just counting read system calls by incrementing a counter. The “tick-1sec” probe fires every second, at which point it runs a system command and then resets the counter. DTrace treats system() as a destructive action, which is why the destructive pragma is set.
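One detail in the wrapper is easy to miss. The D program sits inside single quotes, so Bash never expands $HOSTNAME or $GRAPHITE_SERVER itself. The shell that system() starts does that, and it can only see variables that were exported. Here is a minimal sketch of the same effect without DTrace, using “sh -c” as a stand-in for the shell behind system() (that substitution is my assumption for illustration):

```shell
# Exported, so child processes inherit it (same idea as in the wrapper script).
export GRAPHITE_SERVER="10.0.0.22"

# Single quotes stop this shell from expanding the variable; the child
# shell started by `sh -c` expands it instead.
sh -c 'echo "sending to $GRAPHITE_SERVER"'
```

Drop the “export” and the child shell would see an empty value instead of the address.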

The system command we’re executing simply echoes the metric in Graphite’s format and pipes it to netcat (“nc”), which sends it to the Graphite server. The format is simple: “some.metric.name value epoch_time”. Since walltimestamp is in nanoseconds, the script divides it by 1000000000 to get epoch seconds. My metric here will be dtrace.newton.syscall.read.entry. (Newton is my workstation.)
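If you want to play with the format outside of DTrace, here is a minimal sketch. The helper name graphite_line is hypothetical, and “date +%s” is an assumption that may not hold for every system’s date command:

```shell
# Hypothetical helper: print one sample in Graphite's plaintext format,
# "metric value epoch_time". The timestamp defaults to the current time.
graphite_line() {
    metric="$1"
    value="$2"
    ts="${3:-$(date +%s)}"
    printf '%s %s %s\n' "$metric" "$value" "$ts"
}

# Fixed timestamp here so the printed line is predictable.
graphite_line "dtrace.newton.syscall.read.entry" 42 1300000000

# To actually send a sample (assumes Carbon is listening on 10.0.0.22:2003):
# graphite_line "dtrace.newton.syscall.read.entry" 42 | nc 10.0.0.22 2003
```

Printing the line before piping it to nc is a cheap way to check the format when a metric fails to show up.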

I start that running and then go to the following URL:


http://10.0.0.22:8888/render/?width=400&height=250&target=dtrace.newton.syscall.read.entry&from=-1hours

And this is what I see:

See how flexible it is? If I wanted to run this on 4 web servers, I could fire up the script, unmodified, on all 4 servers and then simply change the hostname in the URL’s target from “newton” to “*”, and it would graph all 4 together without my even having to log onto the Graphite server. This is why I love Graphite: it’s so flexible you can pretty much cram it in anywhere and get useful data in a pinch.
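To make the wildcard trick concrete, here is a small sketch that builds the render URL for a given host pattern. The helper name render_url is hypothetical. The server address and the query parameters are the ones from my URL above:

```shell
# Hypothetical helper: build a render URL for one host name or a wildcard.
render_url() {
    host_pattern="$1"
    printf 'http://10.0.0.22:8888/render/?width=400&height=250&target=dtrace.%s.syscall.read.entry&from=-1hours\n' "$host_pattern"
}

render_url newton   # a single host
render_url '*'      # every host reporting this metric
```

The quotes around '*' matter: they keep the shell from expanding it as a filename glob before the helper sees it.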

Word of warning: The script above is intentionally oversimplistic. My point here is to illustrate the basic principles, nothing more.

7 Responses to “Using Graphite to Graph DTrace Metrics”

  1. Tom says:

    That is awesome! I’ve been looking for a RRDTool replacement for a very long time.

  2. HenrikJ says:

    I have been using something similar for some time now to plot various VFS related metrics on Sol11Exp using active checks in Zabbix …

    • benr says:

      I’d be interested in hearing about your method!

      I’m using Zabbix pretty extensively with an active agent configuration. Does your agent call a DTrace script from zabbix_agentd.conf UserParameters or do you use zabbix_sender?

      The VFS stats I record in Zabbix are all based on KStats and sent via UserParameters.

      I love the fact that Zabbix essentially gives you graphing for free, however the facilities for interacting with the graphs are pretty rudimentary. I’ve really been displeased with the report facilities and found that many other Zabbix users have resorted to using Jasper Reports to coalesce data direct from the Zabbix database rather than deal with the poor reporting capabilities.

      • HenrikJ says:

        Hi Ben,

        I have been using both zabbix-sender and UserParameters
        but am slowly migrating towards flexible UserParameters
        in order to reuse as much as possible.

        All kstat data is collected via UserParameters but data gathered
        via dtrace is pumped in via zabbix-sender.

        I agree that reporting stinks – we have our own highchart based reporting
        tool that is fed from the zabbix db.

  3. Jim says:

    Ben, I’m a community leader on a network of developer websites. I enjoyed your blog content and was wondering if you were interested in some extra exposure on our sites. Drop me a line and I can give you all the details.

    Thanks for your time!

    Jim

  4. Anton Pavlenko says:

    There is a great tool for monitoring and perf. statistics from Sun employee Dimitri Kravtchuk: Dimstat.
    It can be extended really easily with self-written dtrace scripts. It works great for me.
    As I understand it, Graphite does the same job.

  5. Jim says:

    You mentioned that Carbon runs on a udp socket but isn’t it TCP? statsd, a nodejs based listener (https://github.com/etsy/logster) written by Etsy runs on UDP actually.