Backup Over the Net With Amanda
Amanda manages backups for your UNIX systems over the network with speed and reliability to spare. Carla Schroder takes a two-part look at a backup suite that doesn't need a babysitter.
AMANDA, the Advanced Maryland Automated Network Disk Archiver, is a freely available, open source backup utility developed at the University of Maryland. Amanda was originally designed to take advantage of large-capacity tape drives, and it will back up to other media as well. It will even perform tapeless tape backups. No, I am not making this up; this will be explained presently.
Amanda scales nicely from backing up a single machine to a large network. It also supports tape stackers. The only gotcha is that no single dump image can be larger than a single tape. Future versions will have the ability to span a single dump image over several tapes. For now, multiple-tape backups must be broken up into several dump images. Since Amanda runs only on UNIX systems, use Samba to back up Windows clients. A native Win32 client is also available; we'll look at it in part 2. Sorry, no Mac yet.
Amanda is designed to run without constant intervention, needing no attention other than the usual conscientious monitoring any good admin does. Initial setup is rather complicated, but after that Amanda pretty much takes care of everything. If you're using a stacker, you don't even need to worry about changing tapes.
Tapeless Tape Backup
Amanda handles errors in a lovely fashion: clients that are hung or otherwise unavailable are simply bypassed, and the error is logged. Even tape defects won't derail the backup. Dumps are staged on the holding disk, an area of the backup server's hard drive, and if the tape can't be written they simply sit there until the problem is repaired. That is the tapeless tape backup. Then move the data to tape manually with amflush. (All Amanda commands start with 'am': amdump, amrecover, amverify, etc., and each one has its own man page. man amanda is the master reference.)
Amanda uses standard UNIX utilities: dump, gzip, and GNU tar. Scheduling is handled by good ole cron. This means you don't need Amanda to perform a restoration; simply trot out the usual suspects: dd, mt, and gunzip. Of course, using amrestore is preferable, as it finds things and puts them back a lot faster. But it's nice to have a plan B. Remember: no one cares how wonderful your backups are, only your restores.
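The plan B above can be rehearsed without a tape drive. The sketch below is a hedged illustration rather than an official procedure: it assumes a dump image that starts with one 32 KB header block followed by a gzip-compressed tar archive, and it fakes that layout in an ordinary file. The scratch directory and file names are invented for the demonstration.

```shell
#!/bin/sh
set -e

# Scratch area with one file to "back up" (all names are hypothetical).
work=$(mktemp -d)
mkdir "$work/src" "$work/restore"
echo "payroll data" > "$work/src/payroll.txt"

# Fake dump image: one 32 KB header block, then the compressed archive.
dd if=/dev/zero of="$work/image" bs=32k count=1 2>/dev/null
tar -cf - -C "$work/src" . | gzip >> "$work/image"

# Plan B restore: skip the header block, decompress, unpack.
# With a real tape, the input would be the no-rewind tape device,
# positioned at the right tape file with mt beforehand.
dd if="$work/image" bs=32k skip=1 2>/dev/null | gunzip -c | tar -xf - -C "$work/restore"

cat "$work/restore/payroll.txt"
```

Images made with dump instead of tar would need restore as the last stage of the pipeline; the dd and gunzip steps stay the same.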
Amanda supports multiple configurations, for example running a periodic archival backup in addition to a daily backup. This works great on a stacker. The additional backups run at the same time, so the whole job does not take longer, with one exception: you don't want the same clients scheduled for an archival and a daily backup at the same time.
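Two configurations usually mean two sets of cron entries. The sketch below writes a sample crontab fragment to a scratch file for review; it is an illustration under assumptions, not a recommended schedule. DailySet1 is the sample configuration name that appears later in this article, while the Archive configuration name, the /usr/sbin path, and the times are all hypothetical. Putting the archival run on a day the daily run skips is one simple way to keep the same clients out of both at once.

```shell
#!/bin/sh
set -e

fragment=$(mktemp)

# Quoted heredoc delimiter: nothing inside is expanded by the shell.
cat > "$fragment" <<'EOF'
# min hour dom mon dow  command
# Daily configuration: check tapes in the afternoon, run the backup after midnight.
0  16 * * 1-5  /usr/sbin/amcheck -m DailySet1
45 0  * * 2-6  /usr/sbin/amdump DailySet1
# Archival configuration (hypothetical name): early Sunday, when no daily run is scheduled.
45 0  * * 0    /usr/sbin/amdump Archive
EOF

# Review the fragment before adding it to the backup user's crontab.
cat "$fragment"
```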
Going To The Dump
Amanda likes to manage dump cycles its own self, so you won't have the usual "if it's Friday it must be full dump day." Rather, full dumps are performed throughout the dump cycle, in order to balance the amount of data backed up on each run. Amanda keeps logs to tell you where everything is. There are trade-offs in choosing the cycle length. Shorter intervals between full dumps make for faster restores, but the runs take longer and use more tape. Longer dump cycles spread the load better, but make restores more complicated. Cycles are typically seven or fourteen days. This is not a fixed schedule, but an upper limit on the time between full dumps of any one filesystem. Amanda may do full dumps more often than that, depending on the total load. It has a mind of its own, and thinks for itself quite smartly.
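A rough way to picture the balancing act is to divide the total size of all full dumps by the number of runs in a cycle. The numbers below are invented, and the even split is only a back-of-the-envelope model; Amanda's real planner weighs more factors than this.

```shell
#!/bin/sh
set -e

runs_per_cycle=5                  # hypothetical: weekday runs in a seven-day cycle
full_sizes_gb="40 25 60 10 15"    # hypothetical full-dump size of each filesystem

# Add up the full-dump sizes.
total_gb=0
for size in $full_sizes_gb; do
    total_gb=$((total_gb + size))
done

# Integer division, rounded up so the target never understates the load.
per_run_gb=$(( (total_gb + runs_per_cycle - 1) / runs_per_cycle ))

echo "Total full-dump data: ${total_gb} GB"
echo "Target full-dump data per run: about ${per_run_gb} GB"
```

Note that a single large filesystem can exceed the per-run target all by itself on the night its full dump runs, so the real schedule is always lumpier than the average suggests.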
There are several data compression options. Having the client perform compression reduces network traffic. Compressing on the server instead moves the load off the client; however, as Amanda does many tasks in parallel, the increased load could slow the server down quite a bit. Hardware compression can be fast and efficient; just remember to take into account what it will take to perform a data restoration. The cautious admin might not compress extremely sensitive data at all, to be sure of being able to restore it. Consider also the speed of the client machine. There are some amusing comments in the sample /etc/amanda/DailySet1/disklist file:
## Some really slow machines, like Sun2's and some Vaxstations, take
## forever to compress their dumps: it's just not worth it.
Consider the longevity of your archives as well. Generic UNIX tools may have a longer lifespan than specialized, proprietary tools. How will you read your lovingly crafted archives 30 years from now? (The printed page still holds up best, but that's a topic for another day.) Even a five- or ten-year projection is iffy: can you still read 5.25" floppy disks?
Amanda does its job better with software compression, because it can then see the compressed size of each dump and plan tape usage accurately. If you want to use hardware compression, be extra generous in your tape storage allowances, as Amanda won't be able to size jobs up as accurately. Don't use both software and hardware compression! Data that is already compressed won't shrink again, and may even grow.
Amanda requires both server and client software. Here's a tricky little gotcha that trips up new users: the server only knows to look for client software, such as tar and Samba, in one location. You can't configure it to look for things in multiple locations, and who needs such a configuration headache anyway? So these programs must be in the same place on all client machines. This shouldn't be a big deal, but various Linux distributions and versions like to move things around. Standard UNIX commands should always be in /bin: tar, gzip, dd, mt. There's no need to move apps; just put symlinks where Amanda can find them.
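Putting a symlink where Amanda expects a tool takes one ln command per tool. The sketch below practices the idea in a scratch directory rather than touching the real /bin; the scratch directory is a stand-in for whatever location your Amanda installation expects, which is an assumption to check on your own systems.

```shell
#!/bin/sh
set -e

expected_bin=$(mktemp -d)     # stand-in for the directory Amanda looks in

for tool in tar gzip dd; do
    actual=$(command -v "$tool")            # wherever this system keeps the tool
    ln -s "$actual" "$expected_bin/$tool"   # link it into place; do not move it
    ls -l "$expected_bin/$tool"             # show where the link points
done
```

On a real client the same loop would be run as root with the expected directory in place of the scratch one, and only for tools that are actually missing from it.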
Your Linux distribution should have Amanda on it already, client and server. The easiest way to install it is from your installation CDs.
In part 2 we'll look at installation more closely, and run through some configuration options. Please see the Amanda Web site for links to mailing lists, where there is an active and helpful user community.