Cacti host runs out of capacity

Our production Cacti host ran out of capacity recently. We use Cacti to create graphs for MySQL (including the InnoDB and Memory engines) and MongoDB. The key benefit of Cacti is that it’s easy for users to understand, so our developers can check DB performance metrics by themselves. It’s also easy for DBAs to set up, because there are no agents to install or maintain on each DB host.

The benefit of easy setup also causes a problem: all poller actions have to be done on the Cacti server. We ran into a performance problem about a year ago: the poller could not finish all poll items within 1 minute. I replaced the PHP poller with spine, which is written in native C and is more powerful. After that it worked fine. As more and more hosts were added, I raised “Maximum Concurrent Poller Processes” and “Maximum Threads per Process” in step with the growth, and Cacti kept finishing each polling cycle within 1 minute.

At the same time, the host’s load kept increasing, and it recently reached 45 on this physical host with 24 virtual CPUs (2 cores). Polling has started to time out for some hosts. I tried adjusting “Maximum Concurrent Poller Processes” and “Maximum Threads per Process” again, but it didn’t help. A load of 45 is already far above the 24 CPUs available, so the host is overloaded. We could upgrade the Cacti host to a more powerful machine to scale up, but that would not solve the scale-out problem. We would run into the same problem sooner or later.
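To make the "load versus CPU count" comparison concrete, here is a minimal Python sketch. It is not part of Cacti; the function names and the threshold of 1.0 runnable task per CPU are my own assumptions, used only as a rough rule of thumb. The live reading via `os.getloadavg()` only works on Unix-like systems.

```python
import os


def load_per_cpu(load_avg: float, cpu_count: int) -> float:
    """Return the load average divided by the number of logical CPUs."""
    if cpu_count <= 0:
        raise ValueError("cpu_count must be positive")
    return load_avg / cpu_count


def is_overloaded(load_avg: float, cpu_count: int, threshold: float = 1.0) -> bool:
    """Rule of thumb (an assumption): more than `threshold` runnable tasks per CPU."""
    return load_per_cpu(load_avg, cpu_count) > threshold


if __name__ == "__main__":
    # Figures from this post: load 45 on 24 virtual CPUs.
    print(f"post figures: {load_per_cpu(45, 24):.3f} per CPU, "
          f"overloaded={is_overloaded(45, 24)}")

    # Live reading on the current machine, if the platform supports it.
    try:
        one_minute, _, _ = os.getloadavg()
        cpus = os.cpu_count() or 1
        print(f"this host: {load_per_cpu(one_minute, cpus):.3f} per CPU")
    except (AttributeError, OSError):
        print("load average is not available on this platform")
```

With the numbers above, the ratio is close to 1.9 per CPU, which matches the conclusion that tuning the poller settings alone could not help any more.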

At this time, Cacti handles ~800 hosts with 18k data sources and 18k RRDs every minute. “Maximum Concurrent Poller Processes” is 3 and “Maximum Threads per Process” is 60. Each round finishes in 57 seconds on average. The server CPU model is “Intel(R) Xeon(R) CPU X5670 @ 2.93GHz”, 2 cores with 24 VCPUs.
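The figures above can be turned into a rough headroom estimate. The Python sketch below is only back-of-the-envelope arithmetic: the class and property names are hypothetical, and the projection assumes polling time scales linearly with the number of data sources, which is optimistic for a host that is already overloaded.

```python
from dataclasses import dataclass


@dataclass
class PollerStats:
    """Poller figures for one polling cycle (names are illustrative only)."""
    data_sources: int
    processes: int            # "Maximum Concurrent Poller Processes"
    threads_per_process: int  # "Maximum Threads per Process"
    cycle_seconds: float      # average time to finish one round
    interval_seconds: float = 60.0

    @property
    def total_threads(self) -> int:
        return self.processes * self.threads_per_process

    @property
    def throughput(self) -> float:
        """Data sources polled per second during a round."""
        return self.data_sources / self.cycle_seconds

    @property
    def utilization(self) -> float:
        """Fraction of the polling interval used by one round."""
        return self.cycle_seconds / self.interval_seconds

    @property
    def projected_max_data_sources(self) -> int:
        """Upper bound under the linear-scaling assumption."""
        return int(self.data_sources * self.interval_seconds / self.cycle_seconds)


if __name__ == "__main__":
    stats = PollerStats(data_sources=18_000, processes=3,
                        threads_per_process=60, cycle_seconds=57)
    print(f"threads in flight: {stats.total_threads}")
    print(f"throughput:        {stats.throughput:.1f} data sources/s")
    print(f"interval used:     {stats.utilization:.0%}")
    print(f"projected ceiling: {stats.projected_max_data_sources} data sources")
```

Under these assumptions only about 3 seconds of the interval are free, which leaves room for roughly 900 more data sources at best before rounds start to overrun the 1 minute limit.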

Although Cacti has “Distributed and remote polling” on its roadmap, the release date is unknown. That feature would help solve the problem of putting all the load on a single standalone host. We decided to stop adding more hosts to Cacti and to pursue another solution.

About Alex Zeng
I would be very happy if this blog can help you. I appreciate every honest comment. Please forgive me if I'm too busy to reply to your comments in time.
