Nagios and JSON

I’ll assume you already know Nagios (or have moved up to Icinga) for server, storage and network monitoring, and that you have a working monitoring system covering a number of servers, with shell scripts and maybe even a few Windows scripts of various kinds.

You might even monitor your web pages to check that HTTP requests return pages, and so forth. However, under a number of conditions your web site’s ability to access its underlying resources may be compromised even though those resources are all functional according to your monitoring system. This can happen during config changes, software upgrades and numerous other conditions. To narrow down the options and get to the root cause of a failure, it is handy to include a “Monitoring Service Interface” in your code so that the application itself tells you all is “OK”.

A “Monitoring Service Interface” using JSON

One handy way to implement this interface in your web site application is to have a page perform a series of checks and return a JSON representation of the status, so that a script can process it and pass the result back to Nagios.

URL Call

http://www.site.com/page.jsn

Sample Output

{"json":true,"HttpRequestId":"UUuHg8CoMl0AAAyMAAAAAXoi","data":{"Description":"Connected to mydb database OK","Status":0},"success":true}
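To make the server side concrete, here is a minimal sketch of a status page written as a shell CGI script. It is only an illustration under stated assumptions: the function names run_check, emit_status and status_page are hypothetical, the check itself is a stand-in that always succeeds, and the HttpRequestId field from the sample output is left out. A real application would more likely build this page in its own language and framework.

```shell
#!/bin/sh
# Hypothetical CGI-style status page: runs one check and prints the result as JSON.

# run_check: stand-in for a real test, such as a database connection attempt.
# It always succeeds here; replace the body with a real probe.
run_check()
{
    return 0
}

# emit_status STATUS DESCRIPTION: print the JSON body in the layout of the sample output.
# Assumes DESCRIPTION contains no double quotes or backslashes.
emit_status()
{
    printf '{"json":true,"data":{"Description":"%s","Status":%d},"success":true}\n' "$2" "$1"
}

# status_page: map the check result to a Nagios-style status (0 = OK, 2 = CRITICAL).
status_page()
{
    if run_check
    then
        emit_status 0 "Connected to mydb database OK"
    else
        emit_status 2 "Cannot connect to mydb database"
    fi
}

# A CGI response needs a header block before the body.
printf 'Content-Type: application/json\r\n\r\n'
status_page
```

Keeping the Status values aligned with the Nagios exit codes means the checking script can pass them straight through without translation.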

Processing the JSON from a Linux Host

#!/bin/sh
# JSON Nagios Interface.
# Usage:
#
# Parameter 1: an endpoint in the Nagios controller.
# Valid values might be:
#       checkservice
#       checkdbaccess
#
# Parameter 2: a server, e.g. www.server.com.au
#
#
TEST_NAME="${1:-checkpageserver}"
URL="http://${2:-www.server.com.au}/home/nagios/${TEST_NAME}.jsn"

LOGDIR='/logs/nrpe/web-sites'
LOGFILE_PREFIX="${TEST_NAME}"

# Set the timeout in seconds:
TIMEOUT=5

# Standard Nagios exit codes
STATE_OK=0
STATE_WARNING=1
STATE_CRITICAL=2
STATE_UNKNOWN=3
STATE_DEPENDENT=4

if [ ! -d "${LOGDIR}" ]
then
    mkdir -p "${LOGDIR}"
    if [ ! -d "${LOGDIR}" ]
    then
        LOGDIR="/tmp"
    fi
fi

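LOGFILE="${LOGDIR}/${LOGFILE_PREFIX}.`date +%y%m%d`.log"
WGETFILE="/tmp/wget-servicecheck-$$.log"

# Remove the temporary file however the script exits.
trap 'rm -f "${WGETFILE}"' EXIT

# Save the response body in WGETFILE and send wget's own messages to the log,
# so the body stays clean and a failure leaves its details in LOGFILE.
wget --timeout=${TIMEOUT} --tries=1 -O "${WGETFILE}" "${URL}" >> "${LOGFILE}" 2>&1
wget_status=$?

if [ $wget_status -ne 0 ]
then
    echo "Problem connecting to ${URL}. Check ${LOGFILE} for details."
    exit ${STATE_CRITICAL}
fi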
#
# Check for an empty response.
#
if [ ! -s "${WGETFILE}" ]
then
    echo "No response from $URL"
    exit ${STATE_CRITICAL}
fi
#
# Check for nothing found.
#
RESULTS_LINE=`grep 'Description.*Status' "${WGETFILE}"`
if [ -z "${RESULTS_LINE}" ]
then
    echo "No status from $URL"
    exit ${STATE_CRITICAL}
fi
#
# This function parses the JSON output
#
parse_file()
{
    # Very basic JSON parsing: split the line on JSON punctuation and return
    # the first non-empty token after the requested key. Values containing
    # commas, colons, braces or double quotes will be cut short.
    awk -v search_string="${1}" '/Description.*Status/ {
        found_it = ""
        test_comment = ""
        no_elements = split($0, json_elements, "[,{}:\"]")
        # split() numbers its elements from 1 to no_elements.
        for (i = 1; i <= no_elements; i++)
        {
            if (found_it == "y")
            {
                if (json_elements[i] != "")
                {
                    test_comment = json_elements[i]
                    break
                }
            }
            else if (json_elements[i] == search_string)
            {
                found_it = "y"
            }
        }
        printf("%s", test_comment)
    }' "${WGETFILE}"
}
# Parse for specific strings in the JSON
U_COMMENT=`parse_file "Description"`
U_STATUS=`parse_file "Status"`

# Validate the responses.
if [ -z "${U_COMMENT}" ]
then
    U_COMMENT="No description returned."
fi

case "$U_STATUS" in
    0|1|2|3|4)
        # Ignore, as the status is valid.
        ;;
    *)
        # Invalid status.
        U_COMMENT="Invalid Status was returned: ${U_STATUS}."
        U_STATUS=${STATE_CRITICAL}
        ;;
esac

echo "${U_COMMENT}"
cat "${WGETFILE}" >> "${LOGFILE}"
rm -f "${WGETFILE}"
exit ${U_STATUS}

The script uses wget to call the service endpoint, which returns JSON. It then extracts the description text and status code and returns them to Nagios just as a normal NRPE client check would, but running on the localhost.
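Before wiring the check into Nagios it can help to run it by hand and see what Nagios would see: one line of text plus an exit code. The sketch below assumes the script has been saved as /usr/local/nagios/libexec/check_json.sh, which is a hypothetical path, and state_name is an illustrative helper rather than part of Nagios.

```shell
#!/bin/sh
# state_name CODE: translate a Nagios plugin exit code into its label.
state_name()
{
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        3) echo "UNKNOWN" ;;
        *) echo "OUT OF RANGE" ;;
    esac
}

# Hypothetical install location; override it by setting CHECK_SCRIPT in the environment.
CHECK_SCRIPT="${CHECK_SCRIPT:-/usr/local/nagios/libexec/check_json.sh}"

if [ -x "${CHECK_SCRIPT}" ]
then
    # Capture the one-line message and the exit code, exactly as Nagios would.
    OUTPUT=`"${CHECK_SCRIPT}" checkdbaccess www.server.com.au`
    CODE=$?
    LABEL=`state_name "${CODE}"`
    echo "${LABEL}: ${OUTPUT}"
else
    echo "Check script not found at ${CHECK_SCRIPT}"
fi
```

If the label and message look right from the command line, the same command line can be handed to Nagios or NRPE as the check command.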

Enjoy!
