Recently, we observed someone noting that Xen doesn’t provide the same physical-to-virtual (p2v) conversion capabilities that VMware does. Yet another reason Xen isn’t as widespread as VMware, the theory goes. The question this raises, however, is, “why is p2v even necessary?” Assume the physical hardware had instead died; would this person have had to restore configuration files and settings manually from backup?
Postulate: if you cannot tell a server (be it physical or virtual) to reinstall itself, then
simply walk away knowing it will reload and return to full service, you are doing it wrong.
Every network service, configuration file, and setting required for a server to function
absolutely must be automated and configured in your configuration management system.
Configured State Versus Running State
Server deployment systems, such as Kickstart, allow administrators to configure almost any
setting to be applied after a server is loaded. Kickstart is designed to install a set of packages, and
perhaps configure a few users who should be able to log in immediately after the operating system is
installed. You can also copy in configuration files and turn Kickstart into a robust
system that sets up many network services on each server at install time. Another school of thought
on this matter exists, however.
What happens after installation has completed? Immediately afterward, you are in a known
configuration state. Every configuration file on the system is exactly as it was when you copied it
in using Kickstart. From that moment on, however, the running state of the machine is unknowable. A
sysadmin could log in and “fix” something, changing files and not documenting the change. The current running state of your Apache Web server, for example, very likely has diverged from the state in which you first configured it.
If the server were to crash, you might have backups of /etc/, but the restore process is
lengthy, manual, and error-prone. Using a proper configuration management system means that all
changes to the Apache configuration are made in a central place and then pushed out to the
server. Upon re-installation, the server would immediately fetch its configuration files, required
packages, and other bits again: the same configuration files you have been working from and know
with certainty are the correct ones.
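The gap between configured state and running state can be made visible with a small check. The following is a minimal Python sketch, not how any particular configuration management tool works: it assumes the known-good files live in one directory tree (here called `source_dir`) and the server's live files in another (`live_dir`), and it reports every managed file whose live copy is missing or has different contents. The function names and directory layout are hypothetical.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def find_drift(source_dir: Path, live_dir: Path) -> list[str]:
    """List managed files (as relative paths) whose live copy is missing
    or differs from the centrally managed copy."""
    drifted = []
    for managed in sorted(source_dir.rglob("*")):
        if not managed.is_file():
            continue
        relative = managed.relative_to(source_dir)
        live = live_dir / relative
        # A file has drifted if it is absent or its contents have changed.
        if not live.is_file() or file_digest(live) != file_digest(managed):
            drifted.append(relative.as_posix())
    return drifted
```

Calling `find_drift(Path("/srv/config/web01"), Path("/etc"))` (both paths are made-up examples) would return the relative paths of any files a sysadmin had quietly "fixed" by hand.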
Configuration management does much more than manage files. In fact, newer systems hold firmly to
the belief that if you spend most of your time manually managing text files, you are doing something
wrong. Puppet, of course, is the more abstract kind of configuration
management system: it allows you to talk about users, groups, and packages rather than the files
that define them on each type of server you have.
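To illustrate what talking about packages rather than files means, here is a minimal Python sketch (it is not Puppet code and does not reflect Puppet's internals). The `Package` class, the `INSTALLERS` table, and `install_command` are hypothetical names invented for this example; only the `yum install -y` and `apt-get install -y` commands themselves are real. The administrator declares the package once, and the per-platform detail is resolved elsewhere.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Package:
    """An abstract 'this package should be installed' declaration."""
    name: str


# Illustrative subset: install command prefix per OS family.
INSTALLERS = {
    "redhat": ["yum", "install", "-y"],
    "debian": ["apt-get", "install", "-y"],
}


def install_command(package: Package, os_family: str) -> list[str]:
    """Translate the abstract declaration into a platform-specific command."""
    try:
        return INSTALLERS[os_family] + [package.name]
    except KeyError:
        raise ValueError(f"unsupported OS family: {os_family}") from None
```

The same `Package("openssh-server")` declaration yields a `yum` command on one family of servers and an `apt-get` command on another, so the admin never edits platform-specific files directly.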
Beyond knowing the state of your server’s configuration files, which is just the simplest
example, configuration management systems also allow you to represent complex dependencies and act
on conditional states. They also store everything in a central place, which allows admins to quickly
verify or change services across the network, automate their monitoring infrastructure, and gather
data about the state and status of their network.
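Dependencies and conditional actions of this kind can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the algorithm any specific tool uses: the resource names in `DEPENDENCIES` are hypothetical, and the ordering relies on the standard library's `graphlib.TopologicalSorter` (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Hypothetical resources; each maps to the set of resources it depends on.
DEPENDENCIES = {
    "package:httpd": set(),
    "file:/etc/httpd/conf/httpd.conf": {"package:httpd"},
    "service:httpd": {"package:httpd", "file:/etc/httpd/conf/httpd.conf"},
}


def apply_order(dependencies: dict[str, set[str]]) -> list[str]:
    """Return resources in an order where every dependency comes first."""
    return list(TopologicalSorter(dependencies).static_order())


def needs_restart(service: str, changed: set[str],
                  dependencies: dict[str, set[str]]) -> bool:
    """Conditional action: restart only if something the service
    depends on was changed during this run."""
    return bool(dependencies[service] & changed)
```

Here the package is installed before its configuration file is written, the service is started last, and a restart is triggered only when the configuration file actually changed.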
Without configuration management, at least two bad things happen. First, you have no idea how
your systems are configured. Even if you take good notes about the files you’ve changed and the
packages you’ve installed, you will never again reproduce the exact same system. With configuration
management, by contrast, testing an OS upgrade is simple. With Puppet, for example, you ensure a new
test server (or VM) gets the same configurations and tell it to install with the new OS. Problems can
be dealt with before reinstalling the production system with the upgraded OS.
And the second bad thing is that you will never know what/when/where/why/how network services
are running. It is a very bad thing to discover an old FTP server lying around on a server that
hasn’t been patched in years. Proper configuration management practice doesn’t eliminate the need
for security audits, but it does mean you won’t have many surprises.
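Once the intended services are recorded centrally, finding surprises becomes a simple comparison. The sketch below is a hypothetical illustration in Python: it assumes you already have the set of services declared for a host and the set actually observed running on it (how you collect the latter is outside the scope of the example), and `audit_services` is an invented name.

```python
def audit_services(declared: set[str], observed: set[str]) -> dict[str, list[str]]:
    """Compare what should be running with what is running.

    'undeclared' lists services found on the host that nobody declared;
    'missing' lists declared services that were not observed.
    """
    return {
        "undeclared": sorted(observed - declared),
        "missing": sorted(declared - observed),
    }
```

A forgotten FTP daemon would show up in the `undeclared` list the first time the comparison runs, rather than years later.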
Some clear indicators you are doing it wrong may be helpful. If you:
- SSH into servers after installing them, to configure services …
- SSH in a for-loop to many servers at once and perform administration …
- log in to Web servers to find out which virtual host sites are running on …
- manually add new servers (and their services) to your monitoring
- make changes to a server without documenting the change and automating it for future
…you are doing it wrong.
Manage more than 50-100 servers, and this quickly becomes obvious. Scaling IT systems either
breeds high levels of automation to make the infrastructure manageable, or it ends up breeding a
Configuration management, then, is really about IT infrastructure management. ITIL preaches a mythical CMDB, or Configuration Management DataBase. In the pay-for-crap world of software
vendors, this usually means someone will sell you something that inventories all software installed
on every server, and generates a pretty report. You can say you are ITIL-compliant (in this area, at least), but you still have no handle on what the infrastructure is really doing, nor do you have
automation and the ability to recreate systems from scratch. The technology exists, but it is a
complex problem and requires a bit of work to get right. Once right, though, you can scale to more than 1,000 servers with little extra work.
When he’s not writing for
Enterprise Networking Planet or riding his motorcycle, Charlie Schluting works as the VP of
Strategic Alliances at the US Division of LINBIT, the creators of DRBD. He also operates OmniTraining.net, and recently finished Network Ninja, a must-read for every network