Your NMS: Time to Go Homebrew?
Why are there so many network management tools available? In both the corporate and open source space there's no shortage of options, and the most common complaint about them is that it takes too much time to configure the tool to suit a site's needs. There's a reason for that, and you're probably better off just building your own, or at least extending the functionality of your chosen poison via external applications.
In coming articles we'll explain the tools needed to build your own network discovery and management tools, namely Perl, a little SNMP, and a database. We'll also provide examples of the scripts to gather information, so that you may reinvent, correct, or supplement the popular tools and integrate them into your environment properly.
Our goal is to know which port everything on our network is connected to, and to be able to identify each node. With this knowledge, we can automatically monitor nodes, track open ports on them, generate reports, detect unauthorized devices, detect vulnerabilities (and track them to a person), and do just about anything else you can dream of.
Building your own NMS, or starting out with the mindset that OpenNMS, Netdisco, Nedi, Nagios, and similar network management tools will require hacking, is the best approach. The reason everyone complains about the "enterprise" solutions, like HP OpenView, being complex, is that they must be complex to meet such diverse needs. Unfortunately, that can lead to extremely large systems that make seemingly trivial tasks require tons of daunting steps.
We aren't really suggesting that you build your own mapping engine, or completely rewrite Layer 2 network discovery routines. Our point is only that, more often than not, it's necessary to extend them to meet a business's needs.
The first point is that few applications get Layer 2 network discovery right. Nedi, for example, is extremely close, but change one thing in your network (like implementing PortChannel uplinks, grr) and suddenly it will incorrectly report hundreds of MAC addresses on a single port. Whatever the source of your data, you will have errors. And whatever the source, you (should) want to store the data in a central database, so that other applications can benefit from the information.
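As a rough illustration of the central-database idea, here is a minimal sketch in Python using the standard sqlite3 module (the series itself will use Perl; Python is used here only for brevity). The table name `sightings`, its columns, and the `max_macs` threshold are all assumptions made for this example, not anything Nedi or another tool defines. It stores where each MAC address was seen and flags ports that report suspiciously many addresses, which is typically what an uplink or a misreported PortChannel looks like.

```python
import sqlite3


def open_inventory(path=":memory:"):
    """Open (or create) the hypothetical central inventory database."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS sightings (
               mac     TEXT NOT NULL,
               switch  TEXT NOT NULL,
               port    TEXT NOT NULL,
               seen_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
               PRIMARY KEY (mac, switch, port)
           )"""
    )
    return conn


def record_sighting(conn, mac, switch, port):
    """Store one (MAC, switch, port) observation from any discovery source."""
    conn.execute(
        "INSERT OR REPLACE INTO sightings (mac, switch, port) VALUES (?, ?, ?)",
        (mac.lower(), switch, port),
    )


def suspicious_ports(conn, max_macs=8):
    """Return (switch, port, mac_count) for ports with more than max_macs MACs.

    An access port normally has a handful of MACs at most; a port with
    many is probably an uplink, or a discovery error worth checking.
    """
    rows = conn.execute(
        """SELECT switch, port, COUNT(DISTINCT mac)
           FROM sightings
           GROUP BY switch, port
           HAVING COUNT(DISTINCT mac) > ?
           ORDER BY COUNT(DISTINCT mac) DESC""",
        (max_macs,),
    )
    return rows.fetchall()


if __name__ == "__main__":
    conn = open_inventory()
    # Twenty made-up MACs on a PortChannel, one on an access port.
    for i in range(20):
        record_sighting(conn, "00:11:22:33:44:%02x" % i, "sw1", "Po1")
    record_sighting(conn, "AA:BB:CC:DD:EE:FF", "sw1", "Gi0/5")
    print(suspicious_ports(conn))
```

Because every discovery source writes to the same table, a correction script and a reporting script can both work from one view of the network.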
The information gleaned from a network about a node is the cornerstone of true automation. If your site restricts DHCP to known hosts only, then you're probably manually editing DHCP configurations for each device that wants to connect. Instead, why not set all switch ports to a default VLAN, where the router transparently redirects users' Web traffic to a registration page? Once the registration page has gathered the information it needs, the port's VLAN can be set automatically.
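The decision at the heart of that registration workflow can be sketched in a few lines of Python. Everything named here is hypothetical: the VLAN number 999, the `Registration` record, and the in-memory `registry` dictionary stand in for whatever your registration page actually stores. Pushing the chosen VLAN to the switch is deliberately left out.

```python
from dataclasses import dataclass
from typing import Dict

REGISTRATION_VLAN = 999  # hypothetical default VLAN for unregistered devices


@dataclass(frozen=True)
class Registration:
    owner: str  # the person responsible for the device
    vlan: int   # the VLAN the device belongs on once registered


def normalize_mac(mac: str) -> str:
    """Convert common MAC notations to lowercase colon-separated form."""
    cleaned = mac.lower().replace(":", "").replace("-", "").replace(".", "")
    if len(cleaned) != 12 or any(c not in "0123456789abcdef" for c in cleaned):
        raise ValueError("not a MAC address: %r" % mac)
    return ":".join(cleaned[i:i + 2] for i in range(0, 12, 2))


def vlan_for(mac: str, registry: Dict[str, Registration]) -> int:
    """Pick the VLAN for a port based on the MAC address seen on it.

    Registered devices get their assigned VLAN; everything else stays
    on the registration VLAN, where Web traffic is redirected.
    """
    reg = registry.get(normalize_mac(mac))
    return reg.vlan if reg else REGISTRATION_VLAN


if __name__ == "__main__":
    registry = {"00:11:22:33:44:55": Registration(owner="alice", vlan=20)}
    print(vlan_for("0011.2233.4455", registry))     # a registered device
    print(vlan_for("de:ad:be:ef:00:01", registry))  # an unknown device
```

Keeping the decision separate from the mechanism that configures the switch makes it easy to test without touching any network gear.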
That's just one example; I've got thousands. How about automatically monitoring new services as they come online, and then properly associating all trouble tickets regarding that service, so that the node has a searchable history of incidents? Then automatically seeding a configuration management system, like Puppet or Cfengine, with the node's properties, to automate Unix configuration management. Finally, linking this information with a change management database would be the ultimate monitoring and management grail. Phew.
What is an NMS?
Network Management Systems are muddy waters. People call Nagios, a host monitoring system, a type of NMS. HP OpenView, which can automatically discover Layer 3 topology and create pretty maps, is also an NMS. The expensive software that Cisco sells to manage hundreds of switches at once is obviously an NMS. Both server monitoring and network infrastructure monitoring usually fall under the NMS umbrella.
Perhaps a better approach is to define a few problems, or wishes:
- We want to know what's on the network, and track some data about the nodes
- We want a map of Layer 2 and Layer 3 topology
- We'd like to manage the configuration of our network gear in a sane fashion
- We'd like to monitor nodes and the services they provide, via active polling and by receiving SNMP traps
- We'd like pretty charts of availability
Some straightforward goals, right? Try finding an application that can do the above items for your site without major hacking.
The first two items, network discovery and visualization, are pretty standard requirements that go completely unfulfilled. OpenNMS can fulfill the node monitoring and SNMP trap-receiving requirement, but active monitoring is best done with something like Nagios. Many applications create pretty charts quite well, but the data they use may not be what we'd like. Managing network equipment configuration files, and updating them in an automated fashion, is almost always done with homebrew scripts.
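To make "active monitoring" concrete: at its simplest, it means periodically trying to reach a service and recording whether the attempt succeeded. The sketch below does that for TCP services using only Python's standard socket module. The function names `tcp_service_up` and `poll` are invented for this example, and a real monitoring tool such as Nagios does considerably more (protocol-level checks, retries, notifications).

```python
import socket


def tcp_service_up(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def poll(services, timeout=3.0):
    """Check each (host, port) pair once; return {(host, port): is_up}."""
    return {(h, p): tcp_service_up(h, p, timeout) for h, p in services}


if __name__ == "__main__":
    # Hypothetical targets: replace with the nodes and ports you care about.
    for (host, port), up in poll([("127.0.0.1", 22), ("127.0.0.1", 80)]).items():
        print("%s:%d %s" % (host, port, "up" if up else "DOWN"))
```

A successful connection only shows that something is listening on the port, not that the service behind it is healthy, which is one reason to leave this job to a dedicated tool.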
Most sites use a few of these applications, and perhaps others, but end up cobbling the Big Picture together with homegrown scripts. Therefore, in this series dedicated to addressing the problems of the "NMS," we will try to provide enough insight to get you started down the path of NMS-bliss.
Coming up, we'll talk about some important Perl modules that can be used to gather network information (to verify or correct your favorite NMS's view of the world) and store it in a database. Next, we will get more specific, and give examples of exploiting Nedi's discovery data. Automatically configuring network gear is the next logical step, which will be followed by whimsical musings about the more esoteric uses of an NMS described above.
Look for the follow-up articles to learn how crazy (in the wow-automation-beyond-belief sense) your network can get.
Next Installment: Homebrew NMS: Put It Together with Perl and Net::SNMP