cloud-green / zookeeper-snap-charm
This project forked from kngu9/zookeeper-snap-charm
License: Apache License 2.0
The Nagios thresholds for zk_watch_count are -w 100 -c 500, and the current value is 173, so the check is already in the warning state. If the value rises above 500, or the check goes critical for some other reason such as an NRPE timeout or a network blip, the PD incident won't be resolved when the check recovers only as far as warning, because of the way the Nagios/PagerDuty integration works. Ideally we'd either have no warning state at all or else just set -w and -c to the same value (but see a related issue I'm about to file).
Before I realized I couldn't customize this (see #20), I was playing around with check_zookeeper.py to see if our usual -w = -c trick would work. It fails:
ubuntu@juju-6650f3-prod-event-bus-ua-12:~$ /usr/local/lib/nagios/plugins/check_zookeeper.py -o nagios -s 10.15.122.11:2181 --key zk_watch_count -c 100 -w 99
Critical "zk_watch_count" 10.15.122.11:2181!|10.15.122.11:2181=173;99;100
ubuntu@juju-6650f3-prod-event-bus-ua-12:~$ /usr/local/lib/nagios/plugins/check_zookeeper.py -o nagios -s 10.15.122.11:2181 --key zk_watch_count -c 100 -w 100
Ok "zk_watch_count"!|10.15.122.11:2181=173;100;100
ubuntu@juju-6650f3-prod-event-bus-ua-12:~$ _
Ideally it would return critical here. If we could also omit -w entirely, that'd be rad.
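For illustration, here is a minimal sketch of the comparison that would give the behaviour asked for above. It is not the code in check_zookeeper.py; the function name `evaluate` and the status constants are hypothetical, and it assumes that higher values are worse and that thresholds are inclusive.

```python
from typing import Optional

# Hypothetical status codes, following the usual Nagios exit-code convention.
OK, WARNING, CRITICAL = 0, 1, 2


def evaluate(value: float, critical: float, warning: Optional[float] = None) -> int:
    """Classify a metric value, assuming higher values are worse.

    Critical is tested first, so -w == -c yields CRITICAL rather than OK,
    and warning may be omitted entirely.
    """
    if value >= critical:
        return CRITICAL
    if warning is not None and value >= warning:
        return WARNING
    return OK


if __name__ == "__main__":
    print(evaluate(173, critical=100, warning=100))  # equal thresholds
    print(evaluate(173, critical=100))               # no warning threshold
    print(evaluate(173, critical=500, warning=100))  # current charm thresholds
```

Testing critical before warning means the two thresholds never need to be ordered relative to each other, which is what makes both the -w = -c case and the no -w case fall out naturally.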
For the zookeeper-telegraf plugin to run without manual changes, the charm would need to create a config file when a telegraf relation is added, using the same IP as in https://github.com/cloud-green/zookeeper-snap-charm/blob/master/charm/zookeeper/templates/zoo.cfg#L29.
The file should be /etc/telegraf/telegraf.d/extra_plugins.conf, with the contents below:
[[inputs.zookeeper]]
servers = ["{{ client_bind_addr }}:2181"]
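As a rough sketch of what the charm could do, the snippet below renders that file's contents from the bind address using only the Python standard library. The function names `render_telegraf_zookeeper_conf` and `write_telegraf_zookeeper_conf` are hypothetical, the default port 2181 is taken from the example above, and how the charm actually obtains `client_bind_addr` or hooks into the telegraf relation is not shown here.

```python
from pathlib import Path

# Path and contents as described in the issue text.
EXTRA_PLUGINS_PATH = Path("/etc/telegraf/telegraf.d/extra_plugins.conf")
TELEGRAF_ZK_TEMPLATE = '[[inputs.zookeeper]]\nservers = ["{addr}:{port}"]\n'


def render_telegraf_zookeeper_conf(client_bind_addr: str, port: int = 2181) -> str:
    """Return the telegraf zookeeper input config for the given bind address."""
    return TELEGRAF_ZK_TEMPLATE.format(addr=client_bind_addr, port=port)


def write_telegraf_zookeeper_conf(client_bind_addr: str,
                                  path: Path = EXTRA_PLUGINS_PATH) -> None:
    """Write the rendered config to disk (hypothetical helper; needs write access)."""
    path.write_text(render_telegraf_zookeeper_conf(client_bind_addr))


if __name__ == "__main__":
    # Print rather than write, so the sketch can be run without root.
    print(render_telegraf_zookeeper_conf("10.15.122.11"), end="")
```

Keeping the rendering separate from the file write makes the output easy to check without touching /etc.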
Another approach would be to have the snap bind zookeeper to localhost or to all interfaces. That way the plugin's default configuration would work.
Once this is implemented, the Nagios check can be removed, as there's a Prometheus alert replacing it.