Nagios plugin?

Mark Smith smitty at gmail.com
Wed May 7 19:27:45 UTC 2008


> Database has to be checked because there are instances (testing only
> fortunately) when we could lose the db and MogileFS::Backend will keep
> trying to connect, either causing an NRPE plugin timeout or an exit code of
> '82' which Nagios would read as an 'UNKNOWN'.  Mogile is doing exactly what
> it was supposed to be doing, but sometimes that doesn't work for Nagios
> plugins and I needed to find a way to "short circuit" the check for these
> cases. :)

Ahhh, interesting.  I'd still argue that if MogileFS is doing what
it's supposed to (retrying the DB) then that's fine.  MogileFS is up
after all.  And then your database check should catch it and shout.

However, to each his own!  If this works for you, awesome.  :)
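
For anyone who wants the "short circuit" behaviour described in the quote, here is a minimal sketch of one way it could look. It is not the plugin from this thread. It assumes the database accepts plain TCP connections, and the names db_reachable and precheck, plus the host, port, and timeout parameters, are made up for illustration. The only facts relied on are the standard Nagios exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```python
import socket

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def db_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to the database port opens in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False


def precheck(host, port, timeout=3.0):
    """Return (exit_code, message). Hypothetical helper, meant to run
    before the slower MogileFS check so a dead database is reported
    quickly instead of hitting the NRPE timeout."""
    if db_reachable(host, port, timeout):
        return OK, "DB OK - %s:%d accepts connections" % (host, port)
    return CRITICAL, "DB CRITICAL - cannot connect to %s:%d" % (host, port)
```

A plugin would print the message and exit with the code if it is not OK, and only then go on to the MogileFS part of the check. The timeout should stay well under the NRPE timeout for the short circuit to be of any use.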

> I do also monitor DB separately, but if this catches something earlier, then
> I'd want to know.
>
> Yeah, I was just using the config because I felt that's something everyone
> would have on their setup.  Just something quick that came to mind, but
> dumping in a garbage file is definitely something that can be added and much
> more secure.

I was just thinking about the case when you run Nagios on another
machine.  Here at Mozilla we have a few machines that do Nagios checks
in the different data centers and then those are sent up to the
central Nagios server which does the alerting and such.  We don't use
MogileFS, but if we did, it wouldn't be installed on the machines that
do the checks.

> It is using a new key each time.  Sorry about not commenting up that part,
> but I'm adding the hostname and a timestamp onto the key each time.  It's
> also deleting the key right after we do the comparison in the 'cleanup'
> function.

Oh, you're totally right.  Sorry about that, I missed it on my read through.  :(
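
To illustrate the per-check key and cleanup described in the quote, here is a rough sketch under stated assumptions. It is not the code from the plugin. The key layout (prefix, hostname, timestamp joined with colons) and the names make_check_key and round_trip are invented, and a plain dict stands in for the MogileFS client, so no real MogileFS API is shown.

```python
import socket
import time


def make_check_key(prefix="nagios_check", hostname=None, now=None):
    """Build a key that differs per host and per run.
    The prefix and the colon-separated layout are assumptions."""
    host = hostname if hostname is not None else socket.gethostname()
    stamp = int(now if now is not None else time.time())
    return "%s:%s:%d" % (prefix, host, stamp)


def round_trip(store, key, payload):
    """Write payload under key, read it back, compare, then always
    delete the key. 'store' is a dict standing in for a real client."""
    try:
        store[key] = payload
        return store.get(key) == payload
    finally:
        # Cleanup runs even if the write or the comparison fails.
        store.pop(key, None)
```

Note that a one-second timestamp only makes keys unique if the same host does not run the check twice within the same second.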


-- 
Mark Smith / xb95
smitty at gmail.com

