Before Trouble Strikes
In 1992, Hurricane Andrew put 39 major data centers out of commission. In 1993, the World Trade Center bombing caused 21 data centers to shut down. No one likes to think about it, but every organization, regardless of size, runs the risk of a major systems outage, such as a tornado demolishing a data center or a fire destroying a facility and everything in it. A study by the University of Texas found that 85 percent of businesses depend totally or heavily on information technology systems to stay in business, and that losing those systems would cost businesses up to 40 percent of their daily revenues.
Disaster can strike at any time. In fact, there are more than 35 types of disasters, ranging from the most common, such as power outages, to the most catastrophic, such as earthquakes. In essence, a disaster is any interruption of service that results from a force beyond the organization's control. Disaster recovery provides systematic procedures for reacting to and recovering from that external or internal force. Disaster recovery planning, which complements business continuity and contingency planning, ensures that the organization can function effectively if an unforeseen event severely disrupts normal operations.
The following checklist will help the key individuals in your organization work through the thought process of preparing a disaster recovery plan. The objective is to restore all critical business functions, not just isolated functions such as the data center.
Organize the Project
A successful initiative of this magnitude requires support from the organization's senior management, a dedicated disaster recovery team whose members know the critical business systems, and a well-thought-out planning and testing strategy.
Senior executives responsible for disaster recovery planning will perform the first two steps. The disaster recovery coordinator, working with the appropriate team leaders, should perform the remaining steps.
- Determine which senior executive(s) will have overall responsibility for disaster recovery.
- Have this executive appoint a disaster recovery coordinator.
- Appoint a disaster recovery team leader for each operational unit, such as server backup or telephone system.
- Convene the disaster recovery planning team and sub-teams as appropriate.
- Working with senior executives responsible for disaster recovery, the disaster recovery coordinator should identify the following:
  - Scope -- the areas to be covered by the disaster recovery plan
  - Objectives -- what the disaster recovery team is working toward and the course of action it intends to follow
  - Assumptions -- what is being taken for granted or accepted as true without proof
Conduct Business Impact Analysis
The disaster recovery planning team should perform this step to identify which business departments, functions, or systems are most vulnerable to potential threats, what types of threat are possible, and what effect each identified threat would have on each vulnerable area of the organization.
- Identify functions, processes, and systems.
- Interview information systems support personnel.
- Interview business unit personnel.
- Analyze results to determine critical systems, applications, and business processes.
- Prepare an impact analysis of interruptions to critical systems.
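As a minimal sketch of the analysis step, the interview results can be tabulated and sorted so that the systems with the shortest tolerable outage and the largest estimated loss surface first. The system names, the `allowable_delay_hours` and `daily_loss` fields, and the sort order are illustrative assumptions, not a prescribed method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SystemImpact:
    """One row of interview results (all fields are hypothetical)."""
    name: str
    allowable_delay_hours: float  # longest outage the business unit can tolerate
    daily_loss: float             # estimated revenue lost per day of outage


def rank_by_criticality(systems):
    """Order systems most critical first.

    A shorter allowable delay ranks higher; ties are broken by larger daily loss.
    """
    return sorted(systems, key=lambda s: (s.allowable_delay_hours, -s.daily_loss))


if __name__ == "__main__":
    # Made-up figures, for illustration only.
    results = [
        SystemImpact("payroll", allowable_delay_hours=72, daily_loss=5_000),
        SystemImpact("order entry", allowable_delay_hours=4, daily_loss=40_000),
        SystemImpact("email", allowable_delay_hours=24, daily_loss=2_000),
    ]
    for s in rank_by_criticality(results):
        print(f"{s.name}: tolerates {s.allowable_delay_hours} h, loses {s.daily_loss} per day")
```

Sorting on allowable delay first reflects the idea that a system the business cannot do without for even a few hours deserves attention before one that is merely expensive to lose; a team could just as reasonably weight the two factors differently.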
Conduct Risk Assessment
The disaster recovery planning team should work with the organization's technical and security personnel to determine the probability of each functional business unit's critical systems becoming severely disrupted, and to document the amount of risk each business unit can tolerate. For each critical system, complete the following:
- Review physical security, e.g., secure offices and off-hours building access.
- Review backup systems and data security.
- Review policies on personnel termination and transfer.
- Identify systems supporting mission-critical functions.
- Identify vulnerabilities to threats such as physical attacks or acts of God (floods, for example).
- Assess probability of system failure or disruption.
- Prepare risk and security analysis.
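One common way to compare an assessed probability against a documented risk tolerance is to multiply the probability of a disruption by its estimated cost and flag anything above the tolerance. The sketch below assumes that approach; the `Risk` fields, the sample figures, and the `acceptable_loss` threshold are hypothetical and would come from the team's own analysis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Risk:
    """One threat to one critical system (all values are assumed estimates)."""
    system: str
    threat: str
    annual_probability: float  # chance of occurrence in a year, 0.0 to 1.0
    impact: float              # estimated cost of a single occurrence


def expected_annual_loss(risk):
    """Probability times impact: a simple, widely used risk score."""
    return risk.annual_probability * risk.impact


def exceeds_tolerance(risks, acceptable_loss):
    """Return risks whose expected annual loss is above the tolerance, worst first."""
    flagged = [r for r in risks if expected_annual_loss(r) > acceptable_loss]
    return sorted(flagged, key=expected_annual_loss, reverse=True)


if __name__ == "__main__":
    # Made-up figures, for illustration only.
    risks = [
        Risk("file server", "power outage", annual_probability=0.5, impact=10_000),
        Risk("file server", "building fire", annual_probability=0.25, impact=100_000),
    ]
    for r in exceeds_tolerance(risks, acceptable_loss=10_000):
        print(f"{r.system} / {r.threat}: {expected_annual_loss(r):.0f}")
```

A single score like this hides a lot of judgment, so it is best treated as a way to order the discussion in the risk and security analysis, not as a substitute for it.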
Develop Strategic Outline for Recovery
The steps outlined here provide all of the components necessary to perform a recovery. These steps will help pull together information about the operations of all systems, especially those owned or managed by non-technical managers with help from technical support personnel. Steps one through four mainly apply to functional business units that manage technology systems to process critical functions. The disaster recovery planning team and the functional business unit may wish to appoint other appropriate individuals to perform subsequent tasks.
- Assemble groups as appropriate for the following:
  - Hardware and operating systems
  - Other critical functions and business processes as identified in the Business Impact Analysis step
- Light, normal, and heavy processing days
- Transaction volumes
- Dollar volume, if any
- Estimated process time
- Allowable delays (days, hours, minutes, etc.)
- Component name and technical identification if any
- Type (online, batch process, script)
- Run time
- Allowable delay (days, hours, minutes, etc.)
- Name and description
- Type (backup, original, master, history)
- Where are they stored?
- Source of item or record
- Can the record be easily replaced by another source?
- Backup and backup generation frequency
- Number of backup generations available onsite and off-site
- Location of backups
- Media key, retention period, rotation cycle
- Who is authorized to retrieve the backups?
- Type (server hardware, software, research materials, etc.)
- Item name and description
- Quantity required
- Location of inventory, alternative, or off-site storage
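The record and backup details listed above can be kept as structured inventory entries so that gaps are easy to spot. The following is a minimal sketch under stated assumptions: the class name, field names, and the three checks in `find_gaps` are hypothetical choices, not requirements taken from the checklist.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RecordInventoryEntry:
    """One record or data set, using fields drawn from the checklist items."""
    name: str
    record_type: str                     # "backup", "original", "master", or "history"
    storage_location: str
    replaceable_from_other_source: bool
    backup_frequency_days: int           # days between backups
    generations_onsite: int
    generations_offsite: int
    retention_days: int
    authorized_retrievers: List[str] = field(default_factory=list)


def find_gaps(entry):
    """Return human-readable gaps for one entry (the checks are illustrative)."""
    gaps = []
    if entry.generations_offsite < 1:
        gaps.append("no off-site backup generation")
    if not entry.authorized_retrievers:
        gaps.append("nobody is authorized to retrieve backups")
    if entry.retention_days < entry.backup_frequency_days:
        gaps.append("retention period is shorter than the backup interval")
    return gaps


if __name__ == "__main__":
    # Made-up entry, for illustration only.
    ledger = RecordInventoryEntry(
        name="general ledger",
        record_type="master",
        storage_location="server room",
        replaceable_from_other_source=False,
        backup_frequency_days=1,
        generations_onsite=7,
        generations_offsite=0,
        retention_days=30,
    )
    for gap in find_gaps(ledger):
        print(f"{ledger.name}: {gap}")
```

Keeping the inventory in a structured form like this also makes it straightforward to export to a spreadsheet for the business unit leader's review.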
Review On-Site and Off-Site Backup and Recovery Procedures
The disaster recovery planning team should perform this task to provide for a current backup of critical programs and data that can be used in the event of a disaster. In doing so, the disaster recovery planning team can reduce downtime and speed recovery.
- Review current records (operating systems, code).
- Review current off-site storage facility or arrange for one.
- Review backup and off-site backup storage policy or create one.
- Present to functional business unit leader for approval.
Select Alternate Facility
The disaster recovery planning team should identify a location, other than the normal facility, that can be used to process data and/or conduct business in the event of a disaster.
- Determine resource requirements.
- Assess platform uniqueness of unit systems (Macintosh, IBM, Oracle, etc.).
- Identify alternative facilities.
- Review cost/benefit.
- Evaluate and make recommendation.
- Present to business unit leader for approval.
- Make selection.
In Part 2, we will cover Plan Development, Testing, and Ongoing Maintenance for your disaster recovery plan.