Enhancement #195
closed
Zabbix will replace Cacti
Added by Marc Dequènes over 13 years ago.
Updated over 10 years ago.
Category:
Service :: Supervision
Description
We have been testing Zabbix for a few weeks. Here are notes, progress information, and todo items. It may replace Cacti in the future.
Missing items:
- network in/out discard/errors
- CPU per core
- LDAP
- PostgreSQL
- disk space
- % Done changed from 40 to 60
After an electrical problem led to a database disaster, with the database not yet backed up anywhere, I had to rework everything.
This is done and running, with more stats than previously.
Backup still needs to be done, but I'm counting on Korutopi, soon to be operational, to handle Daneel's backups.
Missing hosts:
Missing stats:
- CPU/Load on Yomiko and Maru/Moro
- CPU per core on all hosts
- PostgreSQL on Orfeo
- more Mail on Orfeo (see adm_mail_stats script for a start)
- temperature on all hosts (lm-sensors / ACPI, smartd…)
- temperature on devices?
Anything else?
Cacti should then die soon, once all remaining useful stats found there have been migrated to Zabbix. The MySQL database on Elwing will die afterwards too (what a relief!).
- Subject changed from Trying Zabbix to Zabbix will replace Cacti
- Priority changed from Low to Normal
From Cacti: pgsql_stats.php* has been saved in /root; nothing else matters.
Purged Cacti (and MySQL btw).
- % Done changed from 60 to 70
Updated for changes in #279
Added Load on Maru and Moro.
Moved todo items to #290 (non-essential and non-regression compared to Cacti stats).
Remaining todo:
- Gwaihir
- PG on Orfeo
- Mail stats
Fixed the broken disk read/write stats: Zabbix no longer handles bps stats (since the beginning of the year; maybe the earlier data was wrong), so I added an item to fetch the disk sector size and calculated the bps from the sectors/s data.
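The calculation described above can be sketched as follows. This is only an illustration, assuming that bps here means bytes per second and that the sector size is reported in bytes; the function name and the sample values are hypothetical and are not taken from the actual Zabbix items.

```python
def bytes_per_second(sectors_per_second: float, sector_size_bytes: int) -> float:
    """Convert a sectors/s rate into a bytes/s rate.

    Assumption: one sector always holds `sector_size_bytes` bytes, so the
    throughput is simply the sector rate multiplied by the sector size.
    """
    if sectors_per_second < 0 or sector_size_bytes <= 0:
        raise ValueError("rates must be non-negative and sector size positive")
    return sectors_per_second * sector_size_bytes


# Hypothetical sample: 100 sectors/s on a disk with 512-byte sectors.
sample_rate = bytes_per_second(100, 512)
```

In Zabbix itself this would typically be a calculated item multiplying the two source items, rather than external code.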
- cleanup of the main OS GNU/Linux template:
- removed useless items/triggers
- added system.cpu.num and adapted the load trigger to use it
- changed many trigger severities
- added port and proc.num checks on all templates where they were missing
and probably a few other minor things
- increased StartPollers
- decreased StartTrappers
- increased Timeout
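The idea behind adapting the load trigger to system.cpu.num, mentioned in the template cleanup above, is to compare the load per CPU rather than the raw load. A minimal sketch of that logic, where the function name and the threshold value are hypothetical and not the ones used in the template:

```python
def load_is_high(load_avg: float, cpu_count: int, per_cpu_threshold: float) -> bool:
    """Return True when the load average per CPU exceeds the threshold.

    Assumption: `cpu_count` plays the role of the system.cpu.num value and
    `per_cpu_threshold` is an illustrative limit chosen by the administrator.
    """
    if cpu_count <= 0:
        raise ValueError("cpu_count must be positive")
    return load_avg / cpu_count > per_cpu_threshold


# Hypothetical sample: a load of 8.0 on 4 CPUs is 2.0 per CPU.
overloaded = load_is_high(8.0, 4, 1.5)
```

Scaling by the CPU count lets a single trigger definition serve hosts with different numbers of cores.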
- Status changed from In Progress to Resolved
- % Done changed from 70 to 90
Zabbix replaced Cacti a while ago. Most of the listed items were done; moving the rest to #290.