Microsoft Exchange disaster recovery
It's no picnic to retrieve lost data after an Exchange Server crash, so find out how to prevent data loss and, in the event of a crash, how to restore your databases.
If you've ever experienced a Microsoft Exchange Server crash, you've probably found out that it's no picnic to recover--Microsoft Exchange uses Jet databases, which must be backed up and restored in a totally different manner from that of other operating system files. In this article, I'll explain what's involved in correctly backing up Microsoft Exchange, and discuss some techniques you can use to restore your databases should an Exchange Server crash.
What's involved in an Exchange backup
Aside from the normal Exchange system files, Exchange uses three databases: priv.edb, pub.edb, and dir.edb. As you can guess from the EDB file extensions, these databases use the Jet database format. And like many databases, they can't be backed up through conventional methods when the database is in use. The databases are in use any time the corresponding Exchange services are running. For example, if the Microsoft Exchange Directory service is running, the system considers the dir.edb file to be in use. Likewise, if the Microsoft Exchange Information Store service is running, the priv.edb and pub.edb files are considered to be in use.
Because of the way Exchange keeps database files open, you have two options for backing them up. The first option is to use special backup software. When you install Exchange, it modifies the Windows NT Backup program to make it Exchange-aware. As a result, the Backup program knows that Exchange exists on the system, and the program contains extra code that enables it to back up open databases. If you're using other software (such as Arc Serve or Backup Exec), you'll have to get an add-on module to make your backup software Exchange-aware.
Even with special backup software, the fact remains that you can't back up an open file. Doing so defies one of the basic laws of computing. Even if you could back up an open file, the backup is invalid if a single change is made to the file during the course of the backup. To get around this problem, the Exchange-aware backup software temporarily locks the Exchange databases during backup. Throughout normal operation and during backup, Exchange Server uses transaction logs. These transaction logs store all the database transactions that haven't been committed to disk. While the backup software is running, all database changes are written to the transaction logs rather than to the database itself. When the backup completes, these transaction logs are processed to update the database. When the update is complete, Exchange purges the transaction logs.
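The interplay between the locked database and the transaction logs can be illustrated with a toy model. The sketch below is not how Exchange is implemented; the class name ToyStore and its methods are hypothetical, and a dictionary and a list stand in for the database file and the transaction logs.

```python
class ToyStore:
    """Toy model of a database whose writes are deferred during a backup."""

    def __init__(self):
        self.database = {}           # stands in for the .edb file
        self.log = []                # stands in for the transaction logs
        self.backup_running = False

    def write(self, key, value):
        # Every change is recorded in the log first.
        self.log.append((key, value))
        # Outside a backup, the change is applied to the database right away.
        if not self.backup_running:
            self._replay_and_purge()

    def _replay_and_purge(self):
        # Apply the logged changes to the database, then empty the log.
        for key, value in self.log:
            self.database[key] = value
        self.log.clear()

    def online_backup(self, writes_during_backup=()):
        """Copy the frozen database, then process the log that built up."""
        self.backup_running = True
        snapshot = dict(self.database)        # what would go to tape
        for key, value in writes_during_backup:
            self.write(key, value)            # held in the log only
        self.backup_running = False
        self._replay_and_purge()              # update the database, purge logs
        return snapshot
```

In this model, a message written while online_backup() is running does not appear in the returned snapshot, but it is present in the database once the backup has finished and the log has been replayed.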
Another way to back up Exchange Server is through an offline backup. An offline backup is nothing more than a file-level backup of the priv.edb, pub.edb, and dir.edb files. Again, the Microsoft Exchange Directory service and Microsoft Exchange Information Store service must be interrupted before you can perform a file-level backup. An offline backup is good if you don't yet have the specialized backup software required for an online backup.
As you're probably aware, most backup operations are performed in the middle of the night when you're not (I hope) at the office. As a result, if you need to perform an offline backup, you must automate the process of starting and stopping the various Exchange services. Doing so is easy: simply create a batch file that uses the Net Stop service name command to halt the services. You can then use the Net Start service name command to restart the services later. To schedule the batch files to run at the appropriate time, use the AT command.
Keep in mind that if your Exchange databases are very large, it may take several minutes to stop and start each service. It's also important to note that the services must be stopped and started in a specific order. You must stop the Message Transfer Agent (MTA) first, followed by the information store, and finally the directory service. There's no need to stop the system attendant during an offline backup. When you restart the services, you must start them in the opposite order: the directory service, the information store, and finally, the MTA.
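The ordering rules above can be captured in a short script that writes out the lines for the two batch files. This is only a sketch. The service names MSExchangeMTA, MSExchangeIS, and MSExchangeDS are assumptions (the article gives only the display names), as are the function names, the batch file path, and the daily AT schedule, so check them against your own server before relying on them.

```python
# Assumed service names, listed in the order they must be stopped:
# MTA first, then the information store, then the directory service.
STOP_ORDER = ["MSExchangeMTA", "MSExchangeIS", "MSExchangeDS"]


def stop_script(services=STOP_ORDER):
    """Lines for a batch file that halts the services before the backup."""
    return [f"net stop {name}" for name in services]


def start_script(services=STOP_ORDER):
    """Lines for a batch file that restarts the services in the opposite order."""
    return [f"net start {name}" for name in reversed(services)]


def at_command(time_24h, batch_path):
    """An AT command line that runs batch_path every day at time_24h."""
    return f'at {time_24h} /every:M,T,W,Th,F,S,Su "{batch_path}"'


if __name__ == "__main__":
    print("\n".join(stop_script()))
    print("\n".join(start_script()))
    # Hypothetical path and time, for illustration only.
    print(at_command("01:00", r"C:\scripts\stop_exchange.bat"))
```

Keeping a single list and reversing it for the restart avoids the two batch files drifting out of step if a service is added later.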
When a crash occurs
Let's suppose that Exchange crashes, and you have to restore the databases. The ease or difficulty of doing so is directly related to how badly, and why, the server has crashed. In the next two sections, I'll discuss two general restore procedures: an online restore and an offline restore. These procedures assume that the main Exchange Server components are still functional and that the databases have failed due to a hard-disk failure or some other catastrophe.
An online restore
Obviously, in such a situation, an online restore is by far the easiest to perform. The exact method for performing an online restore varies among backup software. However, it's usually as simple as selecting the databases that you want to restore from tape. Upon doing so, the backup software will check to see if any of the Exchange services that control the various databases are running. If they are, then the backup software will automatically stop the services. When the restore completes, you'll probably have to restart the Exchange services manually.
An offline restore
An offline restore is a little more complicated. When performing an offline restore, the first thing you must do is to make sure that the Exchange services have stopped. Once the services are stopped, copy the database files to the appropriate directories. Then, start the directory service.
When the directory service starts, it means that--at the very least--you've managed to recover the empty mailboxes (which is sometimes a great accomplishment, as I'll explain later). Next, you must recover the information store. However, before you can start the information store, you'll have to make sure that it matches the directory service. To do so, first try starting the information store service without doing anything other than placing the priv.edb and pub.edb files into the correct directory. If the information store starts, you're in business. You can start the MTA, and you're done. However, if the information store won't start, you'll have to synchronize it. To do so, go to the \EXCHSRVR\BIN directory and run the following command: isinteg -patch. This command may take some time to complete if your information store is very large, but after it does, you should be able to start the information store and the MTA services.
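The decision flow of the offline restore, after the database files have been copied back, can be sketched as follows. The function offline_restore and its two callable parameters are hypothetical: start_service is assumed to return True when the named service starts, and run_command is assumed to run a command line from the \EXCHSRVR\BIN directory. The only real command used is isinteg -patch, as given above.

```python
def offline_restore(start_service, run_command):
    """Walk through the restart sequence and return a log of what happened.

    start_service(name) -> bool  hypothetical helper: True if the service started
    run_command(command)         hypothetical helper: runs a command line
    """
    log = []

    # Step 1: the directory service must come up first.
    if not start_service("directory service"):
        return log + ["directory service failed to start"]
    log.append("directory service started")

    # Step 2: try the information store as-is; patch it only if that fails.
    if not start_service("information store"):
        run_command("isinteg -patch")
        log.append("ran isinteg -patch")
        if not start_service("information store"):
            return log + ["information store failed to start"]
    log.append("information store started")

    # Step 3: the MTA comes last.
    if not start_service("MTA"):
        return log + ["MTA failed to start"]
    log.append("MTA started")
    return log
```

Passing the two helpers in as parameters keeps the sketch free of any assumption about how services are actually started on a given server, and it lets the flow be rehearsed with stand-in functions before it is tried on a real machine.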
So far, I've shown you an ideal restore procedure. But as I mentioned earlier, restoring an Exchange database is often much more difficult, because Exchange databases are subject to corruption. Often this corruption starts as a minor database glitch. As time goes on, the problem escalates. When this happens, you may not even notice a problem until you stop and try to restart the Exchange services, only to find that they won't start. The scary thing about such a situation is that it might mean that you've been backing up corrupt data for an extended time, and your backups are probably just as corrupt as the databases that won't start.
In this situation, you can do some things to try to get your databases back into a usable state. First you must determine which databases are corrupt. For example, if the directory service won't start, the dir.edb file is probably the culprit. If the directory service starts, but the information store won't, the priv.edb or pub.edb file (or both) may be corrupt.
Once you've identified the corrupt database, you must restore the database to a consistent state. To do so, open an MS-DOS Prompt window and switch to the \EXCHSRVR\BIN directory. In this directory, run the following command: eseutil /g /database name /x. In place of database name, use ds for the directory service, ispriv for the private information store, or ispub for the public information store. Running this command will test the integrity of the database and tell you whether the database is corrupt or in a consistent state.
If the database is in an inconsistent state, run the following command: eseutil /r. This command will bring all the databases to a consistent state. After running this command, you should be able to restart the databases.
If you receive an error message when running this command, or if the databases still won't start, there's a chance that the databases could be corrupt after all. If you suspect database corruption, you can try running the following command to repair the database. However, be very careful about using this command: It repairs the database by deleting anything that it doesn't understand. The command is eseutil /p /database name.
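The three eseutil steps form an escalation: check, then recover, then repair as a last resort. The sketch below only assembles the command lines as they are written in this article, reading /database name as /ds, /ispriv, or /ispub. The function names and the mapping from file names to switches are illustrative assumptions, and the exact switches should be confirmed against the eseutil help on your own server.

```python
# Mapping from database file to the switch named in the article.
DATABASE_SWITCH = {
    "dir.edb": "/ds",       # directory service
    "priv.edb": "/ispriv",  # private information store
    "pub.edb": "/ispub",    # public information store
}


def integrity_check(database):
    """Command that tests whether the database is consistent."""
    return f"eseutil /g {DATABASE_SWITCH[database]} /x"


def recovery():
    """Command that brings the databases to a consistent state."""
    return "eseutil /r"


def repair(database):
    """Last-resort command: it deletes anything it doesn't understand."""
    return f"eseutil /p {DATABASE_SWITCH[database]}"


def escalation(database):
    """The commands in the order the article applies them.

    Each later command is meant to be run only if the earlier ones
    did not get the database started again.
    """
    return [integrity_check(database), recovery(), repair(database)]


if __name__ == "__main__":
    for command in escalation("priv.edb"):
        print(command)
```

Because the repair step is destructive, a script like this is better used to print the commands for review than to run them unattended.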
Occasionally, you run into a situation in which the entire server dies. If this occurs, the databases were probably OK, so you won't have to go through all the steps I discussed for recovering a corrupt database. Instead, you'll need to reload Exchange from scratch. Doing so presents a unique problem in multiserver environments: At one point during the installation process, the Exchange Setup program will ask if you want to create a new site or join an existing site.
When this happens, the natural inclination is to join the existing site. However, the site already contains a record of the server that you're trying to install. Therefore, you won't be able to add the server to the site. If you try to remove the record of the server from the site and then join the existing site, you've just ruined any chance of recovering the server. This is because the SID number associated with the newly installed copy of Exchange doesn't match the SID number stored in the backed up directory service. When you restore the directory service, the server won't be able to attach to the site, because the SID number in the directory service won't match the SID number that the rest of the site thinks the server should have.
To get around this issue, select the option to create a new site. Assign the server a bogus site name. Later, when you restore the directory service, the bogus site name will be overwritten with the correct site name, and the Exchange server will belong to the correct site once again.
If nothing works
Sometimes, a database may be damaged so badly that nothing seems to be able to correct the problem. If this happens to you, you can use several techniques to ensure that all isn't lost. Often, only one of the three databases will be corrupt. For example, you might have a corrupt dir.edb, but the priv.edb and pub.edb files will be fine. In such a situation, there are methods you can use to regenerate the corrupted file.
For example, in desperate situations, you can rebuild the dir.edb file from a text file or by running a DS/IS consistency check. Use extreme caution with such techniques, though, because they have very serious side effects on other servers within the organization if used incorrectly.
You can also use the default Exchange databases in conjunction with databases that you were able to salvage. By doing so, you'll only lose the contents of that one database rather than the entire server.
Brien M. Posey is an MCSE who works as a freelance writer and as the Director of Information Systems for a national chain of health care facilities. His past experience includes working as a network engineer for the Department of Defense. Because of the extremely high volume of e-mail that Brien receives, it's impossible for him to respond to every message, although he does read them all.