This guide sets up statsd to collect detailed statistics from a system. For an introduction to the concept, please read "measurement with statsd". The application sends data to a statsd server, and the statsd server forwards it to Graphite for storage and visualization.
Keep the statsd server as close as possible to the hosts sending data, so that UDP packets travel a short distance and are less likely to be lost. With multiple data centers, you could set up a statsd server in each one, so that hosts send their statsd packets to the server in their own data center. All statsd servers then flush their data to a central collector (Carbon/Graphite). This transmission is over TCP and, with the configuration used in this guide, happens once every second for each statsd server.
We are going to use bucky, a statsd server written in Python. However, there are many servers implemented in other languages. At work we use Perl extensively and were more comfortable with a Perl-based solution, so we chose Net::Statsd::Server, which handles a considerable amount of traffic very well.
Bucky can collect data through several protocols. For this guide we disable the others and use only statsd.
Install bucky using pip, the Python package manager:
sudo pip install bucky
Bucky can be configured through command line options or read its configuration from a file. By default, if bucky cannot connect to Carbon (the data collection service of Graphite), it retries a limited number of times and then quits. We'll make sure that bucky keeps trying every 5 seconds without a limit. Write these settings to a file named bucky.conf:
debug = False
full_trace = False
log_level = "INFO"
metricsd_enabled = False
collectd_enabled = False
statsd_ip = ""
statsd_port = 8125
statsd_enabled = True
statsd_flush_time = 1.0
graphite_ip = "127.0.0.1"
graphite_port = 2003
graphite_max_reconnects = 0
graphite_reconnect_delay = 5
graphite_pickle_enabled = False
graphite_pickle_buffer_size = 500
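With graphite_pickle_enabled = False, the data is expected to reach Carbon through its plaintext listener, which accepts one metric per line in the form "path value timestamp". The following is a minimal sketch of that line format, not bucky's own code; the function name carbon_line and the metric name are made up for illustration.

```python
import time


def carbon_line(path, value, timestamp=None):
    """Format one metric as a Carbon plaintext line: path, value, timestamp."""
    if timestamp is None:
        timestamp = time.time()
    return "%s %s %d\n" % (path, value, int(timestamp))


# The metric name is illustrative, not necessarily what bucky emits.
print(carbon_line("stats.example.count", 3, 1400000000), end="")
```

A line like this is what a statsd server would write to the TCP connection on graphite_port at each flush.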
Now start bucky:
bucky bucky.conf > bucky.log 2>&1 &
Of course, this is only a quick start; in production you may want to run it as a system daemon. The bucky source distribution comes with scripts for systemd and runit. Make sure that Carbon is running so the data can be stored.
Statsd packets are text over UDP, and each line contains a data bucket. Each data bucket has a name, a value, a type and optionally a sampling factor. Most client libraries provide a separate method for each type, so all you need to specify is the bucket name and value.
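One quick way to confirm that Carbon is accepting connections is to open a TCP connection to the address bucky is configured to use. This is a small sketch using only the standard library; the helper name carbon_is_up is hypothetical, and the default host and port are taken from the bucky.conf above.

```python
import socket


def carbon_is_up(host="127.0.0.1", port=2003, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    print("Carbon reachable:", carbon_is_up())
```

A successful connection only shows that something is listening on that port; it does not prove that metrics are being stored.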
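To illustrate the packet format, here is a rough sketch that builds statsd lines by hand and sends them in one UDP datagram, without a client library. It assumes the common statsd type codes (c for counter, ms for timing, g for gauge, s for set) and the "|@rate" suffix for sampling; the function names and the page.hit bucket are made up for this example.

```python
import socket


def statsd_line(name, value, type_code, rate=None):
    """Build one statsd line: name:value|type, plus |@rate when sampled."""
    line = "%s:%s|%s" % (name, value, type_code)
    if rate is not None:
        line += "|@%s" % rate
    return line


def send_lines(lines, host="localhost", port=8125):
    """Send newline-separated statsd lines in a single UDP datagram."""
    payload = "\n".join(lines).encode("ascii")
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(payload, (host, port))
    finally:
        sock.close()


if __name__ == "__main__":
    send_lines([
        statsd_line("user.login", 1, "c"),           # user.login:1|c
        statsd_line("page.hit", 1, "c", rate=0.1),   # page.hit:1|c|@0.1
    ])
```

Because UDP is connectionless, send_lines does not report an error when no server is listening, which is one reason to keep the server close to the senders.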
Supported types are:
- timing: the time an operation takes, such as executing a database transaction. The protocol tags timer values as milliseconds (ms); the sample code in this guide reports microseconds.
- counter: counts occurrences of an event or operation, such as hits on a web page or occurrences of an error. Counters can be incremented or decremented.
- gauge: an arbitrary value, such as CPU usage as a percentage or memory usage in bytes.
- set: counts the unique values seen during a sample period, such as the number of different IP addresses that requested a URL in that period.
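The difference between the types is easiest to see in how a server might summarize them at flush time. The toy aggregator below is only an illustration under simplified assumptions; it is not bucky's implementation, and the class name, the client.ip bucket and the output suffixes are invented for the example.

```python
from collections import defaultdict


class FlushWindow:
    """Toy aggregator for one flush interval; not bucky's actual code."""

    def __init__(self):
        self.counters = defaultdict(float)
        self.gauges = {}
        self.sets = defaultdict(set)
        self.timers = defaultdict(list)

    def record(self, name, value, type_code):
        if type_code == "c":       # counter: add up increments and decrements
            self.counters[name] += float(value)
        elif type_code == "g":     # gauge: keep the most recent value
            self.gauges[name] = float(value)
        elif type_code == "s":     # set: remember each distinct value
            self.sets[name].add(str(value))
        elif type_code == "ms":    # timing: keep every sample for statistics
            self.timers[name].append(float(value))
        else:
            raise ValueError("unknown type: %r" % type_code)

    def flush(self):
        """Summarize the window as a flat dict of metric name to number."""
        out = {}
        out.update(self.counters)
        out.update(self.gauges)
        for name, members in self.sets.items():
            out[name] = len(members)
        for name, samples in self.timers.items():
            out[name + ".count"] = len(samples)
            out[name + ".mean"] = sum(samples) / len(samples)
            out[name + ".upper"] = max(samples)
        return out


window = FlushWindow()
window.record("user.login", 1, "c")
window.record("user.login", 1, "c")
window.record("client.ip", "10.0.0.1", "s")
window.record("client.ip", "10.0.0.1", "s")
window.record("client.ip", "10.0.0.2", "s")
window.record("db.get_user_data", 120, "ms")
window.record("db.get_user_data", 80, "ms")
summary = window.flush()
print(summary)
```

In this run the counter sums to 2, the set reports 2 distinct addresses, and the timing bucket yields a count, a mean and an upper bound instead of a single number.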
Any programming language with a statsd client library can be used to send data. Here is sample code in Python using the statsd module.
#!/usr/bin/env python
import time
import random
import statsd

sc = statsd.StatsClient('localhost', 8125)

# on user login action
sess_time_start_mics = time.time() * 1000 * 1000
sc.incr('user.login')

# profile any operation, like a db transaction
db_time_start_mics = time.time() * 1000 * 1000
# sleep to fake a db transaction
time.sleep(random.random() / 5)
sc.timing('db.get_user_data', time.time() * 1000 * 1000 - db_time_start_mics)

# ... later on user logout action
time.sleep(random.random() / 5)
sc.timing('user.session', time.time() * 1000 * 1000 - sess_time_start_mics)
sc.incr('user.logout')
Data is sent from the application to bucky, then flushed to Carbon and stored in whisper files. Check the Carbon storage path, /opt/graphite/storage/whisper, to find the new files and directories. If you have set up Graphite, you can find graphs for login and logout under the "user" node, and the "user.session" and "db.get_user_data" nodes under timers.
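To locate a metric on disk, it helps to know that Carbon turns each dot in a metric name into a directory level and stores the last component as a .wsp file. The helper below is a small sketch of that mapping. The function name whisper_path is hypothetical, and the metric name in the example is only a guess at what bucky might emit, so check the directory tree for the actual names.

```python
import os


def whisper_path(metric, root="/opt/graphite/storage/whisper"):
    """Map a dotted metric name to the .wsp file Carbon would write."""
    return os.path.join(root, *metric.split(".")) + ".wsp"


# Hypothetical metric name; the real prefix and suffix depend on the server.
print(whisper_path("stats.timers.db.get_user_data.mean"))
```

Listing the storage directory after running the sample script should show which names were actually created.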