Sometimes our mirrorlist servers stop processing requests. This isn't a major problem, since haproxy simply stops sending them requests, but we would like to know when it happens so we can fix them.
The check could either monitor haproxy on, say, proxy01 and watch for mirrorlists dropping off, or it could use the same URL that haproxy uses to decide whether a mirrorlist is 'alive' and check them all directly.
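For illustration, the first option (watching haproxy for mirrorlist servers that have dropped out) could be sketched roughly as below. This is not the attached script, only a minimal Python 3 sketch under stated assumptions: the proxy name `mirror-lists` and the socket path `/var/run/haproxy-stat` are hypothetical, haproxy is assumed to expose its admin socket with `show stat`, and the WARNING/CRITICAL split is an arbitrary choice.

```python
import csv
import io
import socket

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def query_haproxy_stats(socket_path):
    """Send 'show stat' to the haproxy admin socket and return the CSV text."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(socket_path)
        sock.sendall(b"show stat\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("ascii", "replace")


def evaluate_mirrorlist(stats_csv, proxy_name="mirror-lists"):
    """Return (exit_code, message) for the servers behind proxy_name.

    proxy_name is a hypothetical default; use the name from haproxy.cfg.
    """
    # The CSV header line starts with "# ", which has to be stripped first.
    reader = csv.DictReader(io.StringIO(stats_csv.lstrip("# ")))
    servers = [
        row for row in reader
        if row.get("pxname") == proxy_name
        and row.get("svname") not in ("FRONTEND", "BACKEND")
    ]
    if not servers:
        return UNKNOWN, "UNKNOWN: no servers found for proxy '%s'" % proxy_name

    # Anything whose status does not start with "UP" is treated as down
    # (an assumption: this also counts MAINT and "no check" as down).
    down = [row["svname"] for row in servers
            if not (row.get("status") or "").startswith("UP")]
    if not down:
        return OK, "OK: all %d mirrorlist servers are up" % len(servers)

    # Assumed policy: WARNING if some are down, CRITICAL if all are down.
    code = CRITICAL if len(down) == len(servers) else WARNING
    label = "CRITICAL" if code == CRITICAL else "WARNING"
    return code, "%s: %d of %d mirrorlist servers down: %s" % (
        label, len(down), len(servers), ", ".join(down))


def main(socket_path="/var/run/haproxy-stat"):  # hypothetical socket path
    """Print the Nagios status line and return the exit code."""
    try:
        code, message = evaluate_mirrorlist(query_haproxy_stats(socket_path))
    except OSError as exc:
        code, message = UNKNOWN, "UNKNOWN: cannot query haproxy: %s" % exc
    print(message)
    return code
```

A plugin wrapper would end with `sys.exit(main())` so that NRPE sees the exit code. Keeping the parsing in `evaluate_mirrorlist` separate from the socket access makes it possible to test the logic against a saved `show stat` dump.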
nagios-nrpe check script (check_mirrorlist.py) ready for testing on proxy01.stg (placed under /tmp/)
attachment check_mirrorlist.py
fixed and tested, with help from smooge, on proxy01.stg and proxy01
Attaching for review.
next: nagios changes and then committing to ansible git
changes are ready for commit and push.
Below is the diff against the "touched" nagios-role files, and the newly added files.
{{{
diff --git a/roles/nagios/client/tasks/main.yml b/roles/nagios/client/tasks/main.yml
index 7d1651d..60d38ff 100644
--- a/roles/nagios/client/tasks/main.yml
+++ b/roles/nagios/client/tasks/main.yml
@@ -41,6 +41,7 @@
   copy: src="scripts/{{ item }}" dest="{{ libdir }}/nagios/plugins/{{ item }}" mode=0755 owner=nagios group=nagios
   with_items:
   - check_haproxy_conns.py
+  - check_haproxy_mirrorlist.py
   - check_postfix_queue
   - check_raid.py
   - check_lock
@@ -184,6 +185,7 @@
   template: src={{ item }}.j2 dest=/etc/nrpe.d/{{ item }}
   with_items:
   - check_happroxy_conns.cfg
+  - check_happroxy_mirrorlist.cfg
   - check_varnish_proc.cfg
   when: inventory_hostname.startswith('proxy')
   notify:
diff --git a/roles/nagios/server/files/nrpe.cfg b/roles/nagios/server/files/nrpe.cfg
index 752bca5..07c4593 100644
--- a/roles/nagios/server/files/nrpe.cfg
+++ b/roles/nagios/server/files/nrpe.cfg
@@ -237,6 +237,7 @@ command[check_fedmsg_tweet_proc]=/usr/lib64/nagios/plugins/check_procs -c 1:1 -C
 command[check_fedmsg_masher_proc]=/usr/lib64/nagios/plugins/check_procs -c 1:1 -C 'fedmsg-hub' -u apache
 command[check_supybot_fedmsg_plugin]=/usr/lib64/nagios/plugins/check_supybot_plugin -t fedmsg
 command[check_haproxy_conns]=/usr/lib64/nagios/plugins/check_haproxy_conns.py
+command[check_haproxy_mirrorlist]=/usr/lib64/nagios/plugins/check_haproxy_mirrorlist.py
 command[check_redis_proc]=/usr/lib64/nagios/plugins/check_procs -c 1:1 -C 'redis-server' -u redis
 command[check_autocloud_proc]=/usr/lib64/nagios/plugins/check_procs -c 1:1 -C 'python' -a 'autocloud_job.py' -u root
 command[check_openvpn_link]=/usr/lib64/nagios/plugins/check_ping -H 192.168.1.41 -w 375.0,20% -c 500,60%
}}}
Changes pushed and tested on noc01 for proxy01. Also tested by shutting down one of the mirrorlist servers. Thanks to smooge and nirik.
Also, added the check to proxy04