Building upon our previous Homebrew NMS installment, we will focus this installment on storing discovery data in your custom database. Every piece of data gathered by discovery applications is useful. Unfortunately, there is no single piece of software that can discover everything you wish to know.
Instead of having two (or more) sources of data, it is extremely useful to combine all data sources into a central repository. A future article will discuss correlating IT data with actual IT facts, but having a common source of data is the first step.
Network discovery: what is it? This is the difficult part. Most network management systems will tell you everything you want to know about an IP address, including when it came online, what ports are open, and possibly even the operating system. This is good information, and the implementation of such a system is sure to bring bushels of new and eye-opening data to light. Unfortunately, that’s only half of the picture.
Network discovery, in my mind, is Layer 2 discovery. Layer 3-7 discovery, as described above, doesn’t give nearly enough information. Layer 2, at the MAC address level, is where the interesting stuff happens. First, to identify the physical location of a host, you must have Layer 2 data. Second, you may wish to know when a node moves. With only Layer 3 discovery, you’d never know if something moved, as long as it stayed connected to the same broadcast domain. Finally, Layer 2 data will also provide information about your network architecture at a lower level.
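To make the "node moved" case concrete, here is a minimal sketch in Python (the series itself uses Perl). It assumes two discovery runs have each been reduced to a mapping from MAC address to a (switch, port) pair; the function name and the sample data are hypothetical.

```python
def find_moves(previous, current):
    """Return {mac: (old_location, new_location)} for MACs whose switch port changed."""
    moves = {}
    for mac, new_location in current.items():
        old_location = previous.get(mac)
        # A MAC seen in both runs at different locations has moved.
        if old_location is not None and old_location != new_location:
            moves[mac] = (old_location, new_location)
    return moves


# Hypothetical data: the host keeps its IP address but is plugged into another switch.
previous = {"00:11:22:33:44:55": ("sw-floor1", "Fa0/12")}
current = {"00:11:22:33:44:55": ("sw-floor2", "Gi1/0/3")}
moves = find_moves(previous, current)
```

A Layer 3 view of the same host would show no change at all, because its IP address is the same in both runs.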
There are a few open source applications available for discovering Layer 2 data. NetDisco is one and Nedi is another; both names are short for "network discovery." We’re most experienced with Nedi, so today’s example will use Nedi data.
Once configured, Nedi will crawl your network using CDP, among other protocols, and gather all kinds of information. Each run is stored in a CSV file or MySQL database. For the sake of example, since we want to extract some pieces of information from Nedi for our own uses, we’ll use the CSV (comma separated values) output format.
While you may want to use Nedi’s functionality, it is extremely advantageous to slurp out a few pieces of discovered information, namely:
- MAC address to IP address associations
- MAC address to switch port information
- The first time a node (MAC address) was seen on the network
- The last time it was seen
The above information can provide all the data we’ve been preaching about, most importantly the physical location of a node and the ability to disable its network access.
So with a fully populated nodes.csv file from Nedi, we can begin to parse out the data for other uses. The format of the file is quite simple. The following are the attributes, in order, separated by two semicolons:
- Hostname from DNS
- IP address
- MAC address
- Manufacturer of the network card
- First seen timestamp
- Last seen timestamp
- Switch hostname (the switch where the node is attached)
- Switch port
- IP address of the switch
- Switch port description
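As a rough illustration of reading this format, the Python sketch below (the series itself uses Perl) splits one line on the double-semicolon separator and names the ten fields. The field names, the `parse_node_line` helper, and the sample line are assumptions made for the example; timestamps are kept as opaque strings because their exact format is not specified here.

```python
from collections import namedtuple

# Illustrative labels for the ten attributes, in file order.
Node = namedtuple("Node", [
    "hostname", "ip", "mac", "vendor", "first_seen", "last_seen",
    "switch", "switch_port", "switch_ip", "port_description",
])


def parse_node_line(line, sep=";;"):
    """Split one line into a Node, or return None if the field count is wrong."""
    fields = line.rstrip("\r\n").split(sep)
    if len(fields) != len(Node._fields):
        return None
    return Node(*(field.strip() for field in fields))


# Invented sample line; real Nedi output may differ in detail.
sample = ("pc42.example.com;;192.0.2.42;;001122334455;;Intel;;1136073600;;"
          "1138752000;;sw-floor1;;Fa0/12;;192.0.2.1;;Accounting")
node = parse_node_line(sample)
```

Returning `None` for a short or long line leaves it to the caller to decide whether to skip the record or flag it.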
What a wealth of information! We probably want to track where our nodes have been over time, but for simplicity’s sake, this example is simply going to populate a PostgreSQL database with "bridge table data."
Bridge table data, or the MAC table as it’s sometimes called, is essentially the output of ‘show mac address-table dynamic’ (‘show mac-address-table dynamic’ on older IOS releases) at the Cisco command line. Shoving this data into a "host database" of our own design will allow us to correlate a machine’s host name and its physical location (switch and switch port) with our own internal notion of the node. Our central idea is that inventory tracking, ticket tracking, configuration management, and just about every other piece of IT data should be correlated with physical node data.
Simply put, this is an exercise in populating our own bridge table. The database schema is quite simple; we’re literally creating a "bridge table" in psql to store an entire enterprise’s worth of bridge tables:
            Table "public.bridge"
   Column   |            Type             | Modifiers
------------+-----------------------------+-----------
 switch     | character varying           | not null
 switchport | character varying           | not null
 mac_addr   | macaddr                     | not null
 source     | character varying           |
 sourcedate | timestamp without time zone |
 firstseen  | timestamp without time zone |
 lastseen   | timestamp without time zone |
 vlan       | integer                     |
Indexes:
    "bridge_pkey" PRIMARY KEY, btree (switch, switchport, mac_addr)
    "bridge_ma" btree (mac_addr)
    "bridge_sw" btree (switch, switchport)
Using Perl, again, and DBD::Pg, we’ll begin by defining our DB connection information:
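The series uses Perl with DBD::Pg for this step. As an illustration in Python instead, the sketch below keeps the connection details in one place and turns them into a libpq-style connection string. Every value shown (database name, user, password, host) is a placeholder, and `build_dsn` is a hypothetical helper, not part of any library.

```python
# Placeholder connection settings; replace with values for your own database.
DB_SETTINGS = {
    "dbname": "nms",
    "user": "nms_user",
    "password": "changeme",
    "host": "localhost",
    "port": 5432,
}


def build_dsn(settings):
    """Build a libpq-style "key='value'" connection string."""
    parts = []
    for key, value in settings.items():
        # Inside single quotes, libpq expects backslashes and quotes to be escaped.
        escaped = str(value).replace("\\", "\\\\").replace("'", "\\'")
        parts.append(f"{key}='{escaped}'")
    return " ".join(parts)


dsn = build_dsn(DB_SETTINGS)
```

A string of this shape can be handed to a PostgreSQL driver such as psycopg2's `connect()`. In Perl, `DBI->connect` with a `dbi:Pg:` data source plays the same role.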
Next, we need to open the file and begin reading it. The final block of code is quite large, because it is necessary to execute all processing steps on every line of the file. The steps are: check for data sanity, assign variable names to each attribute of the line, and then perform one of two actions. If the MAC is already known, and its location has not changed, simply update the "lastseen" field in the database. Otherwise, this is a new host entry.
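The update-or-insert decision can be sketched as follows. This is a Python illustration, not the Perl from this series, and it makes several assumptions. An in-memory SQLite database stands in for PostgreSQL so the example runs on its own. The column types are simplified to TEXT. The `record_sighting` helper and the `'nedi'` source label are hypothetical.

```python
import sqlite3


def record_sighting(conn, switch, switchport, mac, first_seen, last_seen):
    """Update lastseen if the MAC is known at this location; otherwise insert a row."""
    cur = conn.execute(
        "UPDATE bridge SET lastseen = ? "
        "WHERE switch = ? AND switchport = ? AND mac_addr = ?",
        (last_seen, switch, switchport, mac),
    )
    if cur.rowcount > 0:
        return "updated"
    # No matching row: the MAC is new, or it has moved to a different port.
    conn.execute(
        "INSERT INTO bridge (switch, switchport, mac_addr, source, firstseen, lastseen) "
        "VALUES (?, ?, ?, 'nedi', ?, ?)",
        (switch, switchport, mac, first_seen, last_seen),
    )
    return "inserted"


# SQLite stand-in for the PostgreSQL bridge table (simplified column types).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bridge ("
    " switch TEXT NOT NULL, switchport TEXT NOT NULL, mac_addr TEXT NOT NULL,"
    " source TEXT, sourcedate TEXT, firstseen TEXT, lastseen TEXT, vlan INTEGER,"
    " PRIMARY KEY (switch, switchport, mac_addr))"
)
mac = "00:11:22:33:44:55"
first = record_sighting(conn, "sw-floor1", "Fa0/12", mac, "2006-01-01 08:00:00", "2006-01-01 08:00:00")
second = record_sighting(conn, "sw-floor1", "Fa0/12", mac, "2006-01-01 08:00:00", "2006-01-02 08:00:00")
moved = record_sighting(conn, "sw-floor2", "Gi1/0/3", mac, "2006-01-03 08:00:00", "2006-01-03 08:00:00")
rows = conn.execute("SELECT switch, firstseen, lastseen FROM bridge ORDER BY switch").fetchall()
```

Trying the UPDATE first and inserting only when it matches nothing means a moved MAC gets a new row rather than overwriting the old one, so the earlier location stays in the table.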
Checking data sanity is quite important, since pieces of information can often be missing. The example skips over unknown data, but it would be just as easy to flag it for human perusal if that is required.
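One possible shape for such a sanity check, again sketched in Python rather than Perl, is shown below. It assumes a record is usable only when it has a 12-hex-digit MAC address, a switch name, and a switch port. The helper names and the accepted MAC separators are choices made for this example.

```python
import re

HEX12 = re.compile(r"[0-9a-fA-F]{12}")


def normalize_mac(raw):
    """Return the MAC as aa:bb:cc:dd:ee:ff, or None if it is not 12 hex digits."""
    digits = re.sub(r"[.:\-]", "", raw.strip())
    if not HEX12.fullmatch(digits):
        return None
    digits = digits.lower()
    return ":".join(digits[i:i + 2] for i in range(0, 12, 2))


def check_fields(mac, switch, switch_port):
    """Return (ok, reason); reason says why a record should be skipped or flagged."""
    if normalize_mac(mac) is None:
        return False, "bad MAC address"
    if not switch.strip():
        return False, "missing switch"
    if not switch_port.strip():
        return False, "missing switch port"
    return True, ""
```

Returning a reason alongside the verdict makes it easy to either skip the record silently or write it to a log for human perusal.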
Hopefully this has been a good example of how easy it is to manipulate discovery data. Next time we’ll begin talking about how all this can be used, and what other pieces of data should be involved.