RRDtool is a wonderful tool for collecting and graphing data.
RRDtool is the OpenSource industry standard, high performance data logging and graphing system for time series data. RRDtool can be easily integrated in shell scripts, perl, python, ruby, lua or tcl applications.
Take a look at the website for additional information… and read on for some things that I find useful.
Data collection
Data are collected into the database and fetched from it. The collection is split into two parts: how they are read, and how they are stored.
Reading of data is specified through the description of a Data Source, or DS. See the docs about rrdtool create for the details, but it’s useful to know that:
- GAUGEs are inputs that can go up and down, like a temperature, the voltage at some pin or the amount of money in a bank account.
- COUNTER is for meters that can only increase, e.g. the number of times that a light has been turned on, the quantity of bits that have entered an interface or the number of times that the sun has risen in the morning.
- DERIVE can be used for the same kind of data that a GAUGE is for, but focuses on the difference with respect to the previous read instead of the absolute value. This can be useful e.g. if you want to track an increase or decrease rate for a quantity. The docs page about rrdtool create also has additional remarks about the relation between DERIVE and COUNTER, so give it a try if you’re having trouble with your COUNTERs.
- ABSOLUTE is for counters that get reset upon reading, i.e. each time you read the value you also reset the counter.
RRDtool is mostly interested in rates, so all of the above are actually translated into a rate, except for GAUGEs that are stored as-is (so that you can track things that actually have little to do with rates). If you want to graph the stock market, use GAUGE.
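The translation into rates can be sketched with a little shell arithmetic. This is only a simplified model, not RRDtool's actual code: the function name ds_value is made up for illustration, it works with integers only, and it ignores counter wrap-around and heartbeat handling. It takes two consecutive readings and the seconds elapsed between them.

```shell
#!/bin/bash
# Simplified model of what each DS type stores, given two consecutive
# readings (prev, curr) taken "elapsed" seconds apart.
# ds_value is a hypothetical helper, not part of RRDtool.
ds_value() {
    local type=$1 prev=$2 curr=$3 elapsed=$4
    case "$type" in
        GAUGE)          echo "$curr" ;;                        # stored as-is
        COUNTER|DERIVE) echo $(( (curr - prev) / elapsed )) ;; # difference per second
        ABSOLUTE)       echo $(( curr / elapsed )) ;;          # reset on read: value per second
        *)              echo "unknown DS type: $type" >&2; return 1 ;;
    esac
}

ds_value GAUGE    1000 1600 300   # prints 1600
ds_value COUNTER  1000 1600 300   # prints 2
ds_value DERIVE   1600 1000 300   # prints -2
ds_value ABSOLUTE 0    1500 300   # prints 5
```

In this model COUNTER and DERIVE differ only in that a real COUNTER treats a decrease as a wrap-around, which the sketch leaves out.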
Times
Time handling in RRDtool is quite interesting. It is assumed that you will feed a new set of values every step, where the step is specified in seconds. The default is 300, so you’re supposed to feed a new set of values every 5 minutes, but of course you can set what you see fit.
The relevant concepts for times in RRDtool are:
- step, i.e. the length of the time range
- start, i.e. when a time range starts
- end, i.e. when a time range ends
You can set a start time when you create a database, but the real start time will be set depending on the step - in particular, as an integer multiple of the step.
It’s useful to think of the timeline as a sequence of time intervals: interval 1, interval 2, …, interval N. The origin is 0, corresponding to the start of the epoch (January 1st, 1970), but time is treated as a sequence of intervals and not of points.
Values stored in the database always refer to one interval, not to a point in time.
So, what do start and end actually mean? They are ways to specify the intervals we are interested in. Each is first mapped to the interval that contains it, then the whole sequence of intervals from the start’s to the end’s (both included) is considered.
When we specify a point in time that separates two intervals, it is assigned to the following one. So, if the step is equal to 60 and start is 600 (separating the two intervals 540-600 and 600-660), the interval considered is 600-660. This is the same as saying that intervals are closed on the left and open on the right.
Intervals are represented by their end time. So, in the example above, if you specify start as 600, the first interval that you will get is the one marked with 660.
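The two rules (intervals closed on the left and open on the right, and labelled with their end time) boil down to one line of integer arithmetic. The following is a minimal sketch; the function name interval_label is made up for illustration and is not an RRDtool command.

```shell
#!/bin/bash
# Label (end time) of the interval that contains timestamp t,
# for intervals of length "step" aligned to the epoch.
# interval_label is a hypothetical helper name.
interval_label() {
    local t=$1 step=$2
    echo $(( (t / step + 1) * step ))
}

interval_label 599 60   # prints 600 (inside 540-600)
interval_label 600 60   # prints 660 (the border goes to the following interval)
interval_label 659 60   # prints 660 (still inside 600-660)
```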
Example: consider a database with a step of 60 seconds and capable of collecting up to three values. The start time has to be “quite high” to avoid triggering some “do what I mean” behaviour in RRDtool.
start=600000000
items=3
step=60
rrdtool create test.rrd \
    --step $step \
    --start $start \
    DS:testdata:GAUGE:120:U:U \
    RRA:MAX:0.5:1:$items
for i in 1 2 3 ; do
    time=$(($start + $i * $step))
    rrdtool update test.rrd $time:$i
done
end=$(rrdtool last test.rrd)
rrdtool fetch test.rrd MAX --start=$start --end=start+180
The output is:
600000060: 1.0000000000e+00
600000120: 2.0000000000e+00
600000180: 3.0000000000e+00
600000240: -nan
which shows how start, end and the marker for each interval are chosen according to the rules described above.
As an additional note, real intervals can span multiple configured steps. For example, if you have a round robin archive (RRA) that aggregates 5 values with a step of 60, each data point actually refers to 300 seconds (5 minutes). When this RRA is accessed, the relevant start and end times will yield time intervals that align to a 300-second chunking of the timeline, starting from the origin of the epoch.
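To see the effect of the interval length on what you get back, the sketch below lists the interval markers selected by a start and an end for a given interval length. It only models the alignment rules described in this section; the function name fetch_labels is made up for illustration and no RRD file is read.

```shell
#!/bin/bash
# List the markers (end times) of all intervals selected by start and end,
# for intervals of length "superstep" aligned to the epoch.
# fetch_labels is a hypothetical helper name.
fetch_labels() {
    local start=$1 end=$2 superstep=$3
    local first=$(( (start / superstep + 1) * superstep ))
    local last=$(( (end / superstep + 1) * superstep ))
    local t
    for (( t = first; t <= last; t += superstep )); do
        echo "$t"
    done
}

# Interval length 60: prints 600000060, 600000120, 600000180 and 600000240
fetch_labels 600000000 600000180 60
# Interval length 300 (5 values of 60 seconds each): prints 600000300 only
fetch_labels 600000000 600000180 300
```

The first call reproduces the markers in the fetch output of the example above; the second shows that the same start and end fall into a single 300-second interval.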
Getting the right data
If you want to be sure to get the right data out of an RRD database, you have to ensure some things:
- you know which round robin archive you’re looking at
- you know how many data points to ask for
RRDtool will try to give you the best available data, but e.g. if you have fine grained data for the last week and you ask for data for the last ten days, you’ll hit a different RRA (if available).
To get exactly all the data in a RRA you can do as follows (assuming the database file is test.rrd):
- run rrdtool info test.rrd to get the relevant data. You will find something like this:
    filename = "test.rrd"
    rrd_version = "0003"
    step = 60
    last_update = 600018000
    header_size = 736
    ds[testdata].index = 0
    ds[testdata].type = "GAUGE"
    ds[testdata].minimal_heartbeat = 120
    ds[testdata].min = NaN
    ds[testdata].max = NaN
    ds[testdata].last_ds = "300"
    ds[testdata].value = 0.0000000000e+00
    ds[testdata].unknown_sec = 0
    rra[0].cf = "MAX"
    rra[0].rows = 300
    rra[0].cur_row = 157
    rra[0].pdp_per_row = 1
    rra[0].xff = 5.0000000000e-01
    rra[0].cdp_prep[0].value = NaN
    rra[0].cdp_prep[0].unknown_datapoints = 0
    rra[1].cf = "MAX"
    rra[1].rows = 300
    rra[1].cur_row = 66
    rra[1].pdp_per_row = 20
    rra[1].xff = 5.0000000000e-01
    rra[1].cdp_prep[0].value = -inf
    rra[1].cdp_prep[0].unknown_datapoints = 0
- detect the RRA - there might be many in a database, so pick your favourite. We will assume that you want to focus on rra[1] in the example above;
- identify the following basic variables:
  - step
  - last_update
  - pdp_per_row (rra[1].pdp_per_row in the example)
  - rows (rra[1].rows in the example)
- calculate the RRA interval length as superstep = step * pdp_per_row
- calculate the end time of the last interval with meaningful data as real_end = last_update - (last_update % superstep) (% representing the modulus function), i.e. last_update rounded down to a multiple of superstep
- consider start = real_end - superstep * rows + 1 and end = real_end - 1. The addition/subtraction of one second is to be sure to fall inside an interval instead of being at one border, just to avoid surprises (this is actually needed for end only)
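As a worked example, the arithmetic can be run on the values from the sample rrdtool info output (step 60, last_update 600018000, and 20 PDPs per row with 300 rows for rra[1]). This is a minimal sketch in plain shell arithmetic, where real_end is last_update rounded down to a multiple of superstep; the function name full_interval is made up for illustration.

```shell
#!/bin/bash
# Compute start, end and interval length covering a whole RRA.
# Arguments: step, last_update, pdp_per_row, rows (as shown by rrdtool info).
# full_interval is a hypothetical helper name.
full_interval() {
    local step=$1 last_update=$2 pdp_per_row=$3 rows=$4
    local superstep=$(( step * pdp_per_row ))
    # Round last_update down to a multiple of superstep
    local real_end=$(( last_update - last_update % superstep ))
    # Step one second inside the first and the last interval
    local start=$(( real_end - superstep * rows + 1 ))
    local end=$(( real_end - 1 ))
    echo "$start $end $superstep $rows"
}

full_interval 60 600018000 20 300   # rra[1]: prints 599658001 600017999 1200 300
full_interval 60 600018000 1 300    # rra[0]: prints 600000001 600017999 60 300
```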
You can then consider start, end and superstep for usage in rrdtool fetch (respectively for --start, --end and --resolution) and in rrdtool graph (respectively for --start, --end and --step).
The above is implemented in the following Perl program get-full-interval.pl:
#!/usr/bin/env perl
use strict;
use warnings;
use English qw< -no_match_vars >;
use List::Util qw< reduce >;
use Data::Dumper;
use RRDs;
$OUTPUT_AUTOFLUSH = 1;

my ($db, $rra_id) = @ARGV;
my $info = rrd_info($db);
my $step = $info->{step};
my $last = $info->{last_update};
my $rra  = $info->{rra}[$rra_id];

# Length of one RRA interval, in seconds
my $superstep = $step * $rra->{pdp_per_row};

# End time of the last interval with meaningful data
my $real_end = $last - ($last % $superstep);

# Move one second inside the last and the first interval
my $end   = $real_end - 1;
my $start = $real_end - ($superstep * $rra->{rows}) + 1;

print "$start $end $superstep $rra->{rows}\n";

# Turn the flat hash returned by RRDs::info into a nested data structure
sub rrd_info {
    my ($db) = @_;
    my $raw = RRDs::info($db);
    my %retval;
    while (my ($key, $value) = each %$raw) {
        my $ref = path_to_pointer(\%retval, name_to_path($key));
        $$ref = $value;
    }
    return \%retval;
} ## end sub rrd_info

# Split a key like "rra[1].cdp_prep[0].value" into path components;
# ids of rra and cdp_prep become array indexes, other ids become hash keys
sub name_to_path {
    my ($name) = @_;
    return map {
        if (my ($name, $id) = m{^(.+?)\[(.+)\]$}mxs) {
            ($name, ($name =~ m{^(?:rra|cdp_prep)$}mxs) ? [$id] : $id);
        }
        else {
            $_;
        }
    } split /\./, $name;
} ## end sub name_to_path

# Walk the path and return a reference to the final slot
sub path_to_pointer {    # see http://www.perlmonks.org/?node_id=443584
    return reduce(sub { ref($b) ? \($$a->[$b->[0]]) : \($$a->{$b}) },
        \shift, @_);
}
Call this program as:
$ get-full-interval.pl test.rrd 1
where the first parameter is the name of the RRD database and the second parameter is the identifier of the RRA you are interested in. The program will output, in order, the following parameters:
- value for --start
- value for --end
- the length of the interval (to be used as --step or --resolution where these parameters make sense)
- the number of data points you will get (useful for setting the right --width if you want to produce a graph)
Graphing a whole database
The following program produces a graph for each variable and each RRA you have in your database, according to the hints provided in the previous section:
#!/bin/bash
db=$1
root=$(basename "$db" .rrd)
variables=$(rrdtool info "$db" | sed -n 's/^ds\[\(.*\)\]\.index.*/\1/p')
rrdtool info "$db" \
    | sed -n 's/^rra\[\(.*\)\]\.cf.*"\(.*\)"$/\1 \2/p' \
    | while read rra cf ; do
        ./get-full-interval.pl "$db" $rra | (
            read start end step rows
            for variable in $variables ; do
                rrdtool graph "$root-$variable-$rra-$cf.png" \
                    --start $start \
                    --end $end \
                    --step $step \
                    --width $rows \
                    --disable-rrdtool-tag \
                    "DEF:v=$db:$variable:$cf" \
                    LINE1:v#000
            done
        )
    done
Of course this is one graph per variable without any fancy bells or whistles… start from the rrdtool graph docs to learn all the masters’ tricks!