How to use Cacti to monitor Cassandra

I’d like to share my experience setting up monitoring for Cassandra in Cacti.

The template for Cassandra is based on mysql-cacti-templates and Cacti Templates for Cassandra. The setup steps are listed in the README file of the package.

The original template is based on Cassandra 0.6.8, while I need to monitor Cassandra 0.8, so I ran into some problems:

Fixed issues:
1. Bug in the array_fill_keys function in ss_get_cassandra_stats.php
My PHP version is 5.1.6, but array_fill_keys is only available from PHP 5.2.0 on, so the file defines a replacement for it. That replacement has a serious bug: it should use “$v” as the key, not “$k”. Finding this bug took me a lot of time, but it is easy to fix:

if( !function_exists('array_fill_keys') ) {
  function array_fill_keys($array, $value) {
    $r = array();
    foreach( $array as $k => $v ) {
      //$r[$k] = $value;
      $r[$v] = $value;          //alex
    }
    return $r;
  }
}
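
To see why the key matters, here is a minimal sketch comparing the two loop bodies side by side. The function names fill_keys_buggy and fill_keys_fixed and the sample metric names are made up for illustration; they are not part of ss_get_cassandra_stats.php.

```php
<?php
// Hypothetical helper names, for illustration only.

// Original replacement: keys the result by array position.
function fill_keys_buggy($array, $value) {
  $r = array();
  foreach ($array as $k => $v) {
    $r[$k] = $value;   // $k is 0, 1, ... for a plain list
  }
  return $r;
}

// Fixed replacement: keys the result by the element itself,
// which is what the built-in array_fill_keys does.
function fill_keys_fixed($array, $value) {
  $r = array();
  foreach ($array as $k => $v) {
    $r[$v] = $value;
  }
  return $r;
}

// Sample names are assumptions, not real keys from the script.
$names = array('read_count', 'write_count');
print_r(fill_keys_buggy($names, 0));  // keys are 0 and 1
print_r(fill_keys_fixed($names, 0));  // keys are 'read_count' and 'write_count'
```

With the buggy version, any later lookup by metric name finds nothing, because the result only has numeric keys.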

2. Problem with the to_shortname function in ss_get_cassandra_stats.php
Cassandra’s JMX property names and items have changed a lot, so the to_shortname function no longer works for many items. Adding all of them would be tedious and unnecessary, so I added 3 lines that ignore them by mapping every unknown item to ‘xx’.

  //alex, to ignore not defined keys in cassandra_keys, map all of them to xx
  if(!isset($cassandra_keys[$pfx ."_$name"]) ) {
    return 'xx';
  }
  return $cassandra_keys[$pfx ."_$name"];

Some items were simply renamed, so I also updated their keys in $cassandra_keys. Here is the updated file ss_get_cassandra_stats.php
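
The fallback can be tried in isolation with a small sketch. Everything here is an assumption for illustration: the function name to_shortname_sketch, its signature, and the two sample entries in the key map are invented, and the real $cassandra_keys is much larger.

```php
<?php
// Hypothetical subset of the key map; these entries are invented.
$cassandra_keys = array(
  'SP_ReadOperations'  => 'a0',
  'SP_WriteOperations' => 'a1',
);

// Stand-in for to_shortname: the real function's signature may differ.
function to_shortname_sketch($cassandra_keys, $pfx, $name) {
  $key = $pfx . "_$name";
  // Unknown items all collapse onto the placeholder 'xx'.
  if (!isset($cassandra_keys[$key])) {
    return 'xx';
  }
  return $cassandra_keys[$key];
}

echo to_shortname_sketch($cassandra_keys, 'SP', 'ReadOperations'), "\n";
echo to_shortname_sketch($cassandra_keys, 'SP', 'SomethingNew'), "\n";
```

Since every unknown item maps to the same short name, whatever ends up stored under ‘xx’ carries no meaning and should not be graphed.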

3. Bug in the get_stats_cache function in ss_get_cassandra_stats.php
Different hosts in one cluster should not share the same cache file, so I added the host name to the cache file name.

  //list($fp, $content) = check_cache($cache_dir, $poll_time, 'cassandra_'. $options['cluster'], $options);     #alex, should not shared between hosts
  list($fp, $content) = check_cache($cache_dir, $poll_time, 'cassandra_'. $options['cluster'] . '_' . $options['host'] . '_stats', $options);
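
As a quick check of the naming scheme, this sketch builds the cache name for two hosts in the same cluster. The helper cache_name_sketch and the cluster and host values are hypothetical; only the string concatenation is taken from the line above.

```php
<?php
// Hypothetical helper wrapping the name expression used in the fix.
function cache_name_sketch($options) {
  return 'cassandra_' . $options['cluster'] . '_' . $options['host'] . '_stats';
}

// Sample cluster and host names are assumptions.
$a = cache_name_sketch(array('cluster' => 'prod', 'host' => 'node1'));
$b = cache_name_sketch(array('cluster' => 'prod', 'host' => 'node2'));

echo $a, "\n";  // cassandra_prod_node1_stats
echo $b, "\n";  // cassandra_prod_node2_stats
```

With the original name, both hosts would have used cassandra_prod, so one host could read the stats cached for the other.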

4. Cassandra’s JMX domain names changed (updated in ss_get_cassandra_stats.php)
‘org.apache.cassandra.service’ changed to ‘org.apache.cassandra.db’
‘org.apache.cassandra.concurrent’ changed to ‘org.apache.cassandra.request’
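
One way to keep such renames in a single place is a small lookup table, sketched below. This is not how the updated script does it; the function rename_jmx_domain and the object name used in the example are hypothetical, and the sketch assumes every object in a renamed domain moved with it, which may not hold for all MBeans.

```php
<?php
// Old domain => new domain, taken from the two renames listed above.
$jmx_domain_renames = array(
  'org.apache.cassandra.service'    => 'org.apache.cassandra.db',
  'org.apache.cassandra.concurrent' => 'org.apache.cassandra.request',
);

// Hypothetical helper: rewrites the domain part of a JMX object name.
function rename_jmx_domain($object_name, $renames) {
  foreach ($renames as $old => $new) {
    // Match only when the old domain is followed by the ':' separator.
    if (strpos($object_name, $old . ':') === 0) {
      return $new . substr($object_name, strlen($old));
    }
  }
  return $object_name;  // unknown domains pass through unchanged
}

// 'type=Example' is a placeholder, not a real MBean.
echo rename_jmx_domain('org.apache.cassandra.service:type=Example', $jmx_domain_renames), "\n";
```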

5. Finally, there are too many graphs in the templates
The templates can add graphs for each keyspace or column family, and many of the metrics no longer exist. That would be too much for our Cacti system, since we collect data every minute. So I deleted many graphs from the template; this is the updated file cacti_host_template_x_cassandra_server_ht.xml.

Besides these solved issues, some remain:
1. Performance: after I added a 4-node Cassandra cluster to our Cacti system, the poller time jumped from 30 seconds to 50 seconds. This brings it very close to the 1-minute polling interval.
2. Cassandra’s JMX property names and items change a lot, and it would be no surprise if they keep changing in coming versions. Maintaining these templates will take a lot of effort.
3. Some metrics cannot be found in JMX although they are easy to get with Cassandra’s nodetool, and it is almost impossible to add metrics that are not in JMX.

So I may use home-grown scripts to create templates later, as we did for MongoDB.

About Alex Zeng
I would be very happy if this blog helps you. I appreciate every honest comment. Please forgive me if I'm too busy to reply to your comments in time.
