Creating an initial RRD

RRD stands for Round Robin Database, and it is what it sounds like. There are a fixed number of records in the database, and once the last record has been written the next update wraps around to the first record, and around and around it goes. In this way, your databases will never grow out of control. The only downside is that, obviously, you've got to know ahead of time how much history you'll want to look at, so that when you create your database it holds enough records. You may want a day's worth of data, or even months.

When creating an RRD database we'll need to specify a couple of things, namely one or more Data Sources and one or more Round Robin Archives. The data source (DS) defines what type of data is accepted and some boundaries on what constitutes good data. The round robin archives (RRA) can almost be thought of as views; each one defines a different way we can store and retrieve data.

The following is an example of database creation. In this example I am setting up an RRD to monitor TempTrax temperatures.

Figure 1. TempTrax RRD Creation Command

[benr@nexus TempTrax-RRD]$ rrdtool create temptrax.rrd \
--start N --step 300 \
DS:probe1-temp:GAUGE:600:55:95 \
DS:probe2-temp:GAUGE:600:55:95 \
DS:probe3-temp:GAUGE:600:55:95 \
DS:probe4-temp:GAUGE:600:55:95 \
RRA:MIN:0.5:12:1440 \
RRA:MAX:0.5:12:1440 \
RRA:AVERAGE:0.5:1:1440

The tool rrdtool is called with the argument create followed by the filename of the RRD database to be created. The next two arguments specify the time at which the database starts, measured in seconds since the Epoch, and the step time, measured in seconds, which is the interval between database updates. In this case I'm using "N" as the start time, which tells rrdtool to use "now", and my step interval is 300 seconds (5 minutes).
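To make the start time concrete, here is a small Python sketch (my own illustration, not part of rrdtool) showing what "N" expands to and how the equivalent create command would look with the Epoch time spelled out:

```python
import time

# "N" in the create command is shorthand for "now": the current time
# measured in seconds since the Unix Epoch (1970-01-01 00:00:00 UTC).
start = int(time.time())

# The step is the expected interval between updates, in seconds.
step = 300  # 5 minutes

# An equivalent create command could therefore spell the start time out:
command = f"rrdtool create temptrax.rrd --start {start} --step {step}"
print(command)
```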

The "DS" lines specify our different data sources. In this case I want to monitor 4 different temperature probes, which I've named sequentially. The following field ("GAUGE") specifies the Data Source Type (DST), which is one of GAUGE, COUNTER, DERIVE or ABSOLUTE. The GAUGE DST works like you'd expect, storing each reading as-is, and is generally the best choice for values like temperatures. COUNTER is for continuously increasing counters and stores the rate of change between readings; DERIVE is like COUNTER but allows the rate to go negative; ABSOLUTE is for counters that reset after each reading. The last 3 fields of the Data Source lines specify the minimal heartbeat and the min and max values. The minimal heartbeat is a value measured in seconds after which the value is said to be unknown (think of it as a timeout). The min and max specify a range for "good" values; if a value is outside of this range it is said to be unknown.
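The difference between the four DSTs can be sketched in a few lines of Python. This is my own illustration of the semantics described above, not rrdtool's actual code, and it ignores details like counter wraparound:

```python
# Hypothetical illustration of how each Data Source Type (DST) turns two
# successive readings into a stored value. `prev` and `curr` are readings
# taken `interval` seconds apart.

def stored_value(dst, prev, curr, interval):
    if dst == "GAUGE":
        # GAUGE stores the reading as-is -- right for temperatures.
        return curr
    if dst == "COUNTER":
        # COUNTER stores the rate of change of an ever-increasing counter.
        return (curr - prev) / interval
    if dst == "DERIVE":
        # DERIVE is like COUNTER, but the rate is allowed to be negative.
        return (curr - prev) / interval
    if dst == "ABSOLUTE":
        # ABSOLUTE assumes the counter resets to zero after each reading,
        # so only the current value is divided by the interval.
        return curr / interval
    raise ValueError(f"unknown DST: {dst}")

print(stored_value("GAUGE", 70, 72, 300))        # 72
print(stored_value("COUNTER", 1000, 4000, 300))  # 10.0
print(stored_value("ABSOLUTE", 0, 3000, 300))    # 10.0
```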

The "RRA" lines specify different round robin archives. These are like views through which the data are stored. Inside the RRD database file each RRA is stored separately, with a predefined number of records. Each time we "update" our database we are adding a Primary Data Point (PDP); these PDPs are then combined and written into each RRA according to a Consolidation Function (CF) that determines the actual value that's recorded. In the above example the first field specifies that we're defining an RRA. The second field specifies the CF that is used, one of AVERAGE, MIN, MAX or LAST. The third field specifies the XFiles Factor (XFF), the fraction of PDPs that can be unknown without making the recorded value unknown. The fourth field is the number of PDPs that make up each recorded value. The final field specifies the number of records this RRA holds.
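The consolidation step can be sketched as follows. Again, this is my own simplified illustration of the behavior just described, not rrdtool's implementation:

```python
# A sketch of how one RRA record is produced: `steps` PDPs are combined
# with a Consolidation Function (CF), and the result is only known if the
# fraction of unknown PDPs does not exceed the XFiles Factor (XFF).

def consolidate(pdps, cf, xff):
    known = [v for v in pdps if v is not None]  # None marks an unknown PDP
    unknown_fraction = 1 - len(known) / len(pdps)
    if unknown_fraction > xff:
        return None  # too many unknowns: the record itself is unknown
    funcs = {"AVERAGE": lambda v: sum(v) / len(v),
             "MIN": min, "MAX": max, "LAST": lambda v: v[-1]}
    return funcs[cf](known)

# 12 PDPs with 4 unknowns: 33% unknown, under the 0.5 XFF, so the
# record is still written.
pdps = [70, 71, None, 72, 74, None, 73, None, 75, None, 72, 71]
print(consolidate(pdps, "MAX", 0.5))  # 75
print(consolidate(pdps, "MIN", 0.5))  # 70
```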

Looking again at the example above, I'm creating a new RRD named temptrax.rrd that starts "now" and is updated (stepped) every 300 seconds (5 minutes). The RRD contains 4 different Data Sources, 1 per probe, of type GAUGE. If a Data Source isn't updated within 600 seconds (10 minutes), or if a value is not between 55 and 95, then the value is considered to be in error and is written as unknown. Then 3 different RRAs are specified. Two of the RRAs record the MIN and MAX values using 12 PDPs, allowing 50% of them to be unknown. Because we're updating every 5 minutes (the step) and each update is a PDP, 12 PDPs means we're adding a record every hour (5 mins * 12), and storing 1440 records means we have 60 days (1440 hrs / 24 hrs) worth of data in these RRAs. For each record the minimum or maximum value of the collected PDPs is used as the recorded value. In the last RRA defined we're using the average of our collected PDPs, still allowing for 50% unknowns, but we're using only 1 PDP per record, which means we're storing every update. With 1440 records at 12 updates per hour, this RRA is storing (1440 / 12 updates per hour / 24 hrs) 5 days worth of data. In this latter case, because we're using a single PDP per record, the CF isn't really important and we probably should have used LAST just for cleanliness.
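The retention arithmetic in that paragraph is easy to verify. A quick Python check, using the step and RRA definitions from Figure 1:

```python
# How far back does each RRA reach?  retention = step * steps * rows.

step = 300  # seconds between updates; each update is one PDP

def retention_days(steps_per_record, rows):
    seconds = step * steps_per_record * rows
    return seconds / 86400  # 86400 seconds per day

# MIN and MAX RRAs: 12 PDPs per record, 1440 records -> 60 days.
print(retention_days(12, 1440))  # 60.0

# AVERAGE RRA: 1 PDP per record, 1440 records -> 5 days.
print(retention_days(1, 1440))   # 5.0
```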

Here's another breakdown of the DS and RRA arguments:

Data Source Fields: DS:DS-Name:DST:HeartBeat:Min:Max

DS

Defines a Data Source Field.

DS-Name

The name of this Data Source.

DST

Defines the Data Source Type. Can be GAUGE, COUNTER, DERIVE or ABSOLUTE.

HeartBeat

Defines the minimum heartbeat, the maximum number of seconds that can go by before a DS value is considered unknown.

Min

The minimum acceptable value. Values less than this number are considered unknown. This is optional; specify "U" (unknown) to leave the minimum unset.

Max

The maximum acceptable value. Values exceeding this number are considered unknown. This is optional; specify "U" (unknown) to leave the maximum unset.

Round Robin Archives: RRA:CF:XFF:Steps:Rows

RRA

Defines a Round Robin Archive.

CF

Consolidation Function. Can be AVERAGE, MIN, MAX, or LAST.

XFF

Defines the XFiles Factor, the fraction (0 to 1) of the PDPs in a consolidation interval that may be unknown while the consolidated value is still recorded as known.

Steps

Defines how many Primary Data Points (PDPs) are consolidated using the Consolidation Function (CF) to create the stored value.

Rows

Defines the number of Rows (records) stored in this RRA.
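To tie the two field breakdowns together, here is a small hypothetical helper (not part of rrdtool) that assembles the colon-delimited DS and RRA arguments in the field order just described:

```python
# Hypothetical helpers that build the DS and RRA argument strings for
# "rrdtool create", making the field order above concrete.

def ds(name, dst, heartbeat, minimum="U", maximum="U"):
    # DS:DS-Name:DST:HeartBeat:Min:Max ("U" leaves a bound unset)
    return f"DS:{name}:{dst}:{heartbeat}:{minimum}:{maximum}"

def rra(cf, xff, steps, rows):
    # RRA:CF:XFF:Steps:Rows
    return f"RRA:{cf}:{xff}:{steps}:{rows}"

print(ds("probe1-temp", "GAUGE", 600, 55, 95))  # DS:probe1-temp:GAUGE:600:55:95
print(rra("AVERAGE", 0.5, 1, 1440))             # RRA:AVERAGE:0.5:1:1440
```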