Sound the Alarm With SNIPS
System and Network Integrated Polling Software can monitor over 25 network services on up to 2000 devices and send outage alerts to a Web page, e-mail, or your pager. Carla Schroder covers the essentials of this useful Unix tool.
SNIPS is a comprehensive utility for monitoring 25+ network services and elements, such as DNS, BGP (Border Gateway Protocol), uninterruptible/standby power supplies, NTP, and RPC, to name a few, and for sounding an alarm when something goes wrong. It runs on any Unix, and will monitor services on any network. Perl skills are necessary to use it effectively: it is written primarily in Perl, both scripts and binaries, with a few C libraries thrown in for flavor.
This makes SNIPS very configurable and customizable. Even ports are configurable: if you don't like the defaults, change them in the configuration files. Monitoring can be done both in real time and by reviewing logs. SNIPS logs only exceptions, for fast and easy pinpointing of trouble spots. Real-time monitoring is via Web browser (SnipsWeb) or an ncurses-based text view (snipstv). snipstv is the simplest interface, with the fewest options; however, it comes in mighty handy when you don't want to run an X session.
It also has a Tcl/Tk-based monitor, tkSnips, which requires both a client and a server. It is powered by ndaemon, a simple daemon with no access control, so its default port, 5005, must be blocked to unauthorized users.
The Web interface refreshes by default at one-minute intervals. It is completely customizable: change the page layout and colors, or even craft custom warning symbols and sound effects.
Alarms are sent to a pager or e-mail, and they are highly configurable. To reduce false alarms, an alarm is triggered by escalating levels of severity. The four severity levels are Info, Warning, Error, and Critical. For example, portmon monitors TCP ports: POP3, IMAP, HTTP (WWW), and such. By default, any connection failure is considered Critical, triggering an alarm. The event is logged, and displayed on whatever real-time monitoring interface you are using. Notifications will continue to be sent hourly, or at whatever interval you choose, until the problem is addressed. Logging in to the interface of your choice (browser, etc.) brings up a screen that gives more information about the alarming event. The information screens provide a place for admin comments, such as "Web server 01 down until 7am for upgrade."
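To make the portmon example concrete, here is a minimal Python sketch of that kind of check: try to open a TCP connection, and treat any failure as Critical. This is not portmon's actual code; the function name check_tcp_port and the choice to report a successful connection as Info are assumptions made for illustration.

```python
import socket

def check_tcp_port(host, port, timeout=5.0):
    """Hypothetical helper: try a TCP connection and report a severity.

    Mirrors the default described above: any connection failure
    (refused, timed out, unresolvable host) counts as Critical.
    Reporting success as Info is an assumption of this sketch.
    """
    try:
        # create_connection resolves the host and connects, or raises OSError
        with socket.create_connection((host, port), timeout=timeout):
            return "Info"
    except OSError:
        return "Critical"
```

Calling check_tcp_port("mail.example.com", 110) would probe a POP3 port on a placeholder host; a real monitor would go on to log the result and decide whether to send a notification.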
Theoretically, with the proper gadgets, the overworked admin could respond to alarms remotely, say, from a fishing boat or a serene forest trail, even to the point of resetting all those "Critical" events to "Warning."
The theory behind SNIPS is rather simple, and ingenious. The various monitors simply poll devices at configured intervals. Thresholds are set by the user. Events are logged and data written only when thresholds are exceeded. Severity ranges are also user-configured: if you don't care, hey, SNIPS doesn't either!
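The poll-and-compare idea fits in a few lines of Python. This is only a sketch of the concept, not SNIPS code: the function names, the device names, and the threshold numbers are all made up, and it assumes a measurement where higher values are worse.

```python
def classify(value, warning, error, critical):
    """Map a measured value to a severity using user-set thresholds."""
    if value >= critical:
        return "Critical"
    if value >= error:
        return "Error"
    if value >= warning:
        return "Warning"
    return "Info"

def poll_once(readings, thresholds):
    """Return (device, value, severity) only for readings over a threshold.

    readings   -- dict of device name -> measured value (hypothetical data)
    thresholds -- (warning, error, critical) tuple chosen by the user
    """
    events = []
    for device, value in readings.items():
        severity = classify(value, *thresholds)
        if severity != "Info":  # nothing is written for healthy devices
            events.append((device, value, severity))
    return events
```

With thresholds of (100, 500, 800), a reading of 20 produces no event at all, 250 produces a Warning, and 900 a Critical. Raise the thresholds and the same readings pass silently, which is the "if you don't care, SNIPS doesn't either" behavior.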
SNIPS is the child of NOCOL, which was developed in the early 1990s. The available monitoring utilities of the time were complex and required brawny hardware. NOCOL was designed to be lean and low-overhead. Separating the functions (monitoring, logging, and reporting) meant it could be distributed over a network, spreading the load. Limiting logging to exceptions only was a stroke of brilliance, keeping logs very small, yet still capturing the important data. Only changes are recorded, for example, from Warning to Critical, or Critical to Error. Zero redundancy, yet still a complete picture.
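Here is a rough Python sketch of change-only logging. It illustrates the idea rather than the actual NOCOL or SNIPS log format; the TransitionLog class is hypothetical, and treating a never-seen device as Info is an assumption.

```python
class TransitionLog:
    """Record a log entry only when a device's severity changes."""

    def __init__(self):
        self.last = {}      # device -> most recently recorded severity
        self.entries = []   # (device, old_severity, new_severity)

    def record(self, device, severity):
        """Log the poll result if it differs from the last one; return True if logged."""
        old = self.last.get(device, "Info")  # assumption: unseen devices start at Info
        if severity == old:
            return False    # same state as last poll: nothing is written
        self.last[device] = severity
        self.entries.append((device, old, severity))
        return True
```

Feed it six polls of one device reading Info, Warning, Warning, Critical, Critical, Error and it stores just three entries: Info to Warning, Warning to Critical, and Critical to Error. The repeats cost nothing, yet the history of what happened is still complete.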
The primary author of SNIPS, Vikas Aggarwal, says it will scale up to monitoring 2000 devices. (Other contributors are noted in the source files.) It runs on low-end hardware, and allows multiple users to view the same data simultaneously. Graphing is done via RRDtool, the Round Robin Database tool. RRDtool stores data very compactly, preventing log and data files from devouring the system.
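The "round robin" part is what keeps storage from growing: a fixed number of slots is allocated up front, and each new sample overwrites the oldest one. The toy Python class below illustrates only that principle. It is not RRDtool's file format or API, and the RoundRobinSeries name is invented for this sketch.

```python
from collections import deque

class RoundRobinSeries:
    """Toy fixed-size sample store: new values push out the oldest."""

    def __init__(self, slots):
        # deque with maxlen silently discards the oldest item when full
        self.samples = deque(maxlen=slots)

    def add(self, value):
        self.samples.append(value)

    def values(self):
        """Return the retained samples, oldest first."""
        return list(self.samples)
```

A series created with three slots and fed five samples holds only the last three, so the storage footprint is known on day one and never changes.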
The best documentation is on the SNIPS Web site (see Resources, below). A copy of the Web site is included in the tarball, but it may not be up-to-date. The installation instructions are detailed and clear, so I'll just go over a few potential gotchas.
Download and unpack the tarball, and do the configure-make-make install dance. SNIPS has a nice installation: the interactive Configure script asks useful questions, rather than requiring the user to wade through a bunch of configuration flags. The one useful thing it does not do is allow customizing the location of the /data and /logs directories. All files are placed under /usr/local/snips. Files of constantly varying size really should go in /var. And it's often a wise precaution to squirrel log files away where an intruder cannot get to them. No big deal, just change things after installation.
/usr/local/snips/etc/snips.conf is the global configuration file. /usr/local/snips/etc/snipsperl.conf is the master configuration file for the Perl monitors.
The Configure script does ask where to put the man pages. Be sure to set this correctly, or your system may not find them. Look for your system's man directory; on Red Hat Linux 8 it's /usr/share/man.
The installation doc suggests saving the output from make in a file:
# make >& make.out
Do the same thing with make install, with a different filename:
# make install >& make1.out
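Or, a really slick trick is to send the output to both a file and the screen:
# make | tee make.out
This form pipes only standard output. In a Bourne-style shell such as bash, make 2>&1 | tee make.out also catches the error messages.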
Then you see the output as it is generated, and the helpful messages at the end:
Now run make install
There's a lot of useful information in there, so be sure to review it carefully. For example:
"Installed basic web files under
/usr/local/snips/web/html and /usr/local/snips/web/cgi
Move into desired web location.
Edit the following files under /usr/local/snips/etc
snipsweb-confg updates webusers
Also chown httpd webcookies updates"
It's a quick way to review file permissions and ownership as well:
"install -c -m 751 tpmon /usr/local/snips/bin/"
There are two ways to control when the monitors run: from init scripts in /usr/local/snips/init.d, or by using crontab. Mix n match as you like. I like crontab; making the computer do the work is a good thing. All the monitors you want to run must be listed in the keepalive_monitors.pl script.
A Perl interface for developing additional modules using the snipslib.pl library is included. See the installation docs for details.
If You Want To Spend Money
NetVigil is the commercial version of SNIPS. It has more features and technical support, and runs on Windows as well as Unix.