Visualizing Cassandra nodetool cfhistograms output using a histogram
Apache Cassandra includes a lot of functionality and tools which provide good visibility into your cluster health and performance.
Many of these performance and health metrics are exposed over the JMX interface and through the nodetool command line tool. nodetool is a simple wrapper around the JMX interface which gives you access to some of the most commonly used attributes from the command line.
Another feature, added recently and available in Cassandra 1.2, is request tracing. Request tracing allows you to see exactly what happens while a query executes and exactly how long each step takes.
This data is very granular and includes everything from how much time it takes to parse the CQL query to how much time it takes to talk to other nodes in the cluster and read data from memory and / or disk.
This functionality is very powerful, but it’s only available in Cassandra 1.2.
Some of the Cassandra clusters we operate here at Rackspace, more specifically on the Cloud Monitoring team, don’t run Cassandra 1.2 yet. Because of that, I’m going to focus on another very useful feature which is available in the older versions of Cassandra today.
This feature is the cfhistograms command exposed by the nodetool utility.
cfhistograms nodetool command
The cfhistograms command prints statistical histograms for a particular column family. The output includes the following information:
- Distribution of the write latency
- Distribution of the read latency
- Distribution of number of sstables accessed during a read
- Distribution of the row size
- Distribution of number of columns in a row
This information is very useful, but the default output is convoluted and hard to read. If you Google around, you can find some good posts which explain how to interpret this output (e.g. Cassandra 0.7.x - Understanding the output of nodetool cfhistograms), but interpreting the raw command line output is still time-consuming and cumbersome.
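To make the raw layout a bit more concrete, here is a minimal Python 3 sketch of how the text output could be parsed. It is not the script described in this post. It assumes the Cassandra 1.x layout: a header row starting with "Offset", followed by rows holding an offset and one count for each of the five columns. The column order in COLUMNS, the function name parse_cfhistograms and the numbers in SAMPLE are all made up for illustration.

```python
# Assumed column order of the cfhistograms table (Cassandra 1.x layout).
COLUMNS = ["SSTables", "Write Latency", "Read Latency", "Row Size", "Column Count"]


def parse_cfhistograms(text):
    """Return {column name: [(offset, count), ...]} parsed from raw output."""
    data = {name: [] for name in COLUMNS}
    in_table = False
    for line in text.splitlines():
        if not in_table:
            # Skip everything up to and including the header row.
            in_table = line.strip().startswith("Offset")
            continue
        fields = line.split()
        # A data row is the offset followed by one count per column.
        if len(fields) != len(COLUMNS) + 1 or not all(f.isdigit() for f in fields):
            continue
        offset = int(fields[0])
        for name, value in zip(COLUMNS, fields[1:]):
            data[name].append((offset, int(value)))
    return data


# Invented sample input, only meant to show the shape of the data.
SAMPLE = """\
Keyspace1/Standard1 histograms
Offset      SSTables     Write Latency      Read Latency          Row Size      Column Count
1                 12                 0                 0                 0                 3
2                  4                 7                 0                 0                 0
3                  0                21                 5                 0                 9
"""

histograms = parse_cfhistograms(SAMPLE)
print(histograms["Write Latency"])  # [(1, 0), (2, 7), (3, 21)]
```

Each column ends up as its own list of (offset, count) pairs, which is the shape a plotting library needs to draw one histogram per column.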
Visualizing cfhistograms output
Around a year ago I was debugging a performance issue in one of our clusters, so I decided to write a simple Python script which visualizes the cfhistograms output using a histogram.
This script is nothing fancy, but it does its job. In the background it uses a couple of lines of Python and matplotlib to convert the raw text output into nice looking histograms.
Usage
The script is available as a gist on GitHub.
1. Download and chmod the script
2. Install the dependencies
Optionally, if you want nicer graphs, you can also install the prettyplotlib library.
3. Run the script
For example:
All this script does is read data from the input file, process it, and write 5 different histogram files to the output directory.
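The writing step can be sketched roughly as follows. This is a simplified Python 3 illustration and not the actual script from the gist: the function names write_histograms and output_filename, the PNG format and the file naming are assumptions. It expects the data as a dictionary that maps each column name to a list of (offset, count) pairs, and it selects matplotlib's non-interactive Agg backend so that no display is required.

```python
import os


def output_filename(column_name):
    """Map a column name such as 'Write Latency' to 'write_latency.png'."""
    return column_name.lower().replace(" ", "_") + ".png"


def write_histograms(histograms, output_dir):
    """Write one bar chart per column into output_dir and return the paths."""
    # Select the Agg backend before pyplot is loaded; Agg renders straight
    # to files and does not need a display.
    import matplotlib
    matplotlib.use("Agg")
    import matplotlib.pyplot as plt

    os.makedirs(output_dir, exist_ok=True)
    paths = []
    for name, rows in histograms.items():
        offsets = [offset for offset, _count in rows]
        counts = [count for _offset, count in rows]
        positions = list(range(len(rows)))

        fig, ax = plt.subplots()
        # Bucket offsets are typically not evenly spaced, so bars are drawn
        # at consecutive positions and labelled with the offset values.
        ax.bar(positions, counts)
        ax.set_xticks(positions)
        ax.set_xticklabels([str(offset) for offset in offsets], rotation=90)
        ax.set_title(name)
        ax.set_xlabel("Offset")
        ax.set_ylabel("Count")

        path = os.path.join(output_dir, output_filename(name))
        fig.savefig(path)
        plt.close(fig)  # release the figure's memory before the next column
        paths.append(path)
    return paths
```

Called with all five columns, for example write_histograms(data, "output/"), this would produce five image files, one per column.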
Conclusion
I hope you find this script useful and that it allows you to more easily interpret the output of the nodetool cfhistograms command.
In the future, I will try to write more about how we used the output of this command in practice to identify an issue and a misconfiguration in one of our clusters.
Edit 1 (September 13th, 2013): Modify script to be more robust and ignore any
additional data in the input before the actual header.
Edit 2 (September 13th, 2013): Modify script to work without an X server.
Edit 3 (September 29th, 2013): Modify script to use prettyplotlib library (if available).
Edit 4 (September 30th, 2013): Make script more robust: don't crash, and ignore columns where all the values are zero.
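As a rough idea of what ignoring all-zero columns can look like, here is a small Python 3 sketch. It is an illustration under assumptions and not the code from the script: the function name drop_empty_columns and the example values are made up, and the data is assumed to map each column name to a list of (offset, count) pairs.

```python
def drop_empty_columns(histograms):
    """Keep only the columns that have at least one non-zero count."""
    return {
        name: rows
        for name, rows in histograms.items()
        if any(count != 0 for _offset, count in rows)
    }


# Invented example values.
example = {
    "Read Latency": [(1, 0), (2, 0), (3, 0)],  # all zeros, so it is dropped
    "Write Latency": [(1, 0), (2, 7), (3, 21)],
}
print(sorted(drop_empty_columns(example)))  # ['Write Latency']
```

A column with no rows at all is dropped as well, since there is nothing to plot for it.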