The CIO's Role in Preparing for Disaster Recovery
Introduction
Over the last two decades, the roles of computers and telecommunications systems have evolved considerably. Once only reporting tools, they are now important operational capabilities. A few short years ago, the best use the CFO could envision for them was to report on what happened soon after it happened. The CEO saw his systems as tools to track progress so he could make adjustments toward the achievement of goals. The CIO did not exist.
Today, the CFO demands that systems achieve near real-time reporting and control over a wide variety of accounting and related functions. The CEO views his systems as a tool to achieve an advantage over competitors. The CIO was "invented" to make it all happen quickly, transparently and at virtually no cost.
Some CIOs have the title and some do not, but they all share the common responsibility for making systems satisfy needs. Be they accountants or artisans, employees use systems to meet specific objectives. The CIO's job is to make systems meet needs, even when those needs are evolving. The charter goes even beyond this: the CIO ushers the systems through supporting roles that enhance the achievement of the company vision.
The result of all this work and care is a network of networks providing high speed, quick response, automatic control and reporting. Often there is a room or a set of rooms containing mainframe and midrange computers along with servers, gateways and routers. Wires and cables of all flavors emanate from these rooms, crawling through ceilings, walls and floors, reaching every cubicle and office. This is the support structure for a facility and for an entire company. It was built over decades, and no single expert understands it all. No single document contains the information required to rebuild it. Yet the company depends upon it and cannot perform well without it.
Then there is the fire or the earthquake, hurricane, flood, ice storm, snow storm or other disaster. Perhaps a water main broke in the parking lot. Maybe the local power feed failed. Possibly the local or long distance phone lines went dead. A bomb threat or a collision involving a truck carrying toxic gas could cause evacuation of the building. A small plane may even land on your roof. There are many possible events, including accidents and intentional destruction, that can bring down your well-honed systems.
The CIO's job also includes making the proper preparations so that if there is a disaster, the company can continue operating. After a disaster, operations may not be as smooth or as efficient, but they will continue if the precautions have been taken. This article examines those preparations and the CIO's role in preparing a Disaster Recovery Plan.
What is a Disaster Recovery Plan?
A Disaster Recovery Plan is a complex set of analyses, preparations and procedures with a single goal: keep the business running after a disaster occurs while helping it resume normal operations as quickly and intelligently as possible. A Disaster Recovery Plan is not an inventory of existing systems, nor is it a guidebook for duplicating those systems. It is an action plan for providing key corporate functions with the capabilities they need when they need them.
The Disaster Recovery Plan is known by a variety of other names. Some call it a Business Resumption Plan, others a Contingency Plan. Some say it is a Business Continuation Plan, and others call it a Crisis Management Plan. Whatever the designation, the function is the same. Among disaster recovery planning professionals, these terms are used interchangeably.
Establishing the Need
Some consider the need for a Disaster Recovery Plan to be manifestly obvious. Others relegate it to the scrap heap of cost cutting measures. These are both incorrect extremes.
The need for disaster recovery has evolved in parallel with technology. Establishing the need for a Disaster Recovery Plan means assessing why the company evolved to its current state. In the past decade, the meaning of "fast" has changed. It has evolved from responding within days, to reacting within hours, to an expectation of immediate answers. Interdependencies between technology and operations have also evolved. In times past, systems provided reports. Today, systems control production and operations.
An important part of establishing the need for a Disaster Recovery Plan is to assess the criticality of systems to operations, and another is to identify the meaning of "fast." The CIO is in the position to make the determination at the outset. The CIO knows what "fast" means to the company and what the general implications are of not responding in the time required. The CIO also understands how different systems work together to support daily operations.
The Disaster Recovery Plan protects people, information and equipment. These are the three axes of recovery planning. Knowing how these three elements work together to meet company goals and to perform daily operations is at the heart of the CIO's realm. Without that knowledge, no Disaster Recovery Plan can succeed.
Responsibilities
A successful plan is a corporate plan supported by top levels of management and built by specialists with cooperation from workers, supervisors and middle managers. The CIO is the key executive who can coordinate support throughout the levels of the company. Communication and coordination among the various levels of the company are the key elements in building a plan that works. Building the plan provides an opportunity to raise awareness of the process and explain the benefits.
Systems affect a wide variety of people: clerks using terminals for data entry, financial analysts extracting data from mainframes and analyzing it on PCs, and management using decision support tools. Each must be made aware of how a Disaster Recovery Plan helps protect the function, the job, and the company.
Disaster recovery can and does work. Many recent disasters have proven that prepared companies survive and flourish; unprepared companies often fade away. Those that did flourish made it clear from the outset that disaster recovery planning is everyone's job, not just the job of a disaster recovery coordinator or manager. The CIO must accept the mantle of communicating to all levels of management that each must accept responsibility for successful implementation of a plan.
The Business Impact Analysis
The most important tasks in developing a Disaster Recovery Plan are identifying the key corporate functions, the capabilities they need, when they need those capabilities, and how they interact with other functions. The Business Impact Analysis (BIA) defines the scope of the plan, identifies the functions needing quickest recovery, and establishes the recovery strategies. From this information the disaster recovery planner can define the objectives of the Disaster Recovery Plan.
The first step in building a BIA is to limit the scope and to specify the environment. This establishes the overall goal of the plan, defining whether the plan will protect a single building, a corporate campus, or an entire company. It also provides the opportunity to begin formal work on the plan by analyzing what systems and capabilities are within the complex or depend upon the complex.
The second step is to categorize corporate functions through an interview process using matrix analysis methodology. The planner must interview knowledgeable employees from every functional unit within the environment. This provides the data and the perspective of the people doing the work.
Besides being a necessary step in the disaster recovery process, this is also an opportunity to learn from the people who know their jobs best. The data collected helps define the objectives of the plan. These interviews reveal the relative time-criticality ranking of all company functions. This ranking provides the means to solve the most critical problems first and the information to decide how many of the corporate functions will be protected within cost and schedule constraints.
The interview process is the time to fight the tired, old excuses every disaster recovery planner has heard. "When the big one comes, we're all finished anyway, so why bother?" "If a disaster strikes, I'll just quit and find another job!" "During a crisis, nobody will care about the company; we will all be worried about our families and homes!" While interviewing to collect information, the planner can also spread the word: the Disaster Recovery Plan protects the employees as well as the company. There are good answers to these excuses; the interviewer must take the time to address these concerns.
The third step is to establish the recovery strategies and justify the cost. The CIO stands at the nexus of information technology, operational processes and corporate policy. That is the best place from which to understand the need, strategize on the most effective methods for recovering, and justify the cost.
The cost justification can be presented as a contribution to a daily loss for each Category of functions. In this type of presentation (shown at left), the functions are grouped according to a clustering of their time criticality measures. Each group is assigned a Category number so that Category I is considered the most time critical and the other categories are less critical. A bar chart, such as the one shown, can graphically illustrate the revenue and other losses incurred through the absence of that function for each day, as of the day the impact begins.
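The arithmetic behind such a chart can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, loss figures, and category boundaries below are hypothetical and do not come from any real analysis. Each function is placed in a Category by how long it can be absent before losses begin, and its daily loss is counted from the day the impact begins. A running total of the same figures gives the cumulative view of losses over time.

```python
from dataclasses import dataclass
from itertools import accumulate


@dataclass(frozen=True)
class BusinessFunction:
    name: str
    tolerable_outage_days: int  # days the function can be absent before losses begin
    daily_loss: float           # estimated loss per day once the impact begins


# Hypothetical category boundaries (assumption): (max tolerable days, label).
CATEGORY_LIMITS = [(1, "I"), (3, "II"), (7, "III")]


def categorize(fn):
    """Assign a Category label; Category I is the most time critical."""
    for limit, label in CATEGORY_LIMITS:
        if fn.tolerable_outage_days <= limit:
            return label
    return "IV"


def daily_loss_by_category(functions, days):
    """Return {category: [loss on day 1, ..., loss on day `days`]}."""
    table = {}
    for fn in functions:
        row = table.setdefault(categorize(fn), [0.0] * days)
        for day in range(1, days + 1):
            # A function contributes only after its tolerable outage has passed.
            if day > fn.tolerable_outage_days:
                row[day - 1] += fn.daily_loss
    return table


def cumulative_loss(functions, days):
    """Running total of all categories' losses, day by day."""
    rows = daily_loss_by_category(functions, days).values()
    per_day = [sum(column) for column in zip(*rows)]
    return list(accumulate(per_day))


# Hypothetical interview results (assumption, for illustration only).
FUNCTIONS = [
    BusinessFunction("order entry", 0, 50_000.0),
    BusinessFunction("customer billing", 1, 30_000.0),
    BusinessFunction("payroll", 3, 20_000.0),
    BusinessFunction("marketing analysis", 10, 5_000.0),
]

if __name__ == "__main__":
    for category, row in sorted(daily_loss_by_category(FUNCTIONS, 5).items()):
        print(category, row)
    print("cumulative", cumulative_loss(FUNCTIONS, 5))
```

The per-category rows correspond to the bars of a daily loss chart. In practice the boundaries would come from the clustering of the interview data rather than from fixed thresholds like these.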
Another useful presentation view is a Cumulative Loss Summary, in which corporate losses contributed by each function are graphed against the time since the disaster. This chart illustrates well the total losses over time. This is often a sobering view. Taken together, the two charts support a reasoned decision on how many functions to protect and over what time period to implement the protections.
Implementing the Plan
Once the needs have been established, the strategies developed and the funding allocated, the time comes to write the formal procedures. There are two types of disaster recovery procedures: action plans and information sheets. Action plans clearly document specific actions to perform. Information sheets provide data necessary for performing those actions.
Action plans implement specific recovery strategies. For example, activating a reserve system for a critical function may require some key individuals to travel to a location, power up some systems, perform a readiness test, and call the command center. These steps, along with some extra detailed instructions, constitute an action plan contained in one procedure. Good procedures are easy to use, with charts illustrating the steps and easy-to-follow instructions. In an emergency, ease of use is paramount.
Information sheets provide background information needed by a variety of disaster recovery personnel and teams. For example, the list of Disaster Declaration authorities, along with their work and home phone numbers, is in a fact sheet that is distributed to disaster recovery personnel. Lists of disaster recovery team personnel and their contact numbers are also documented in information sheets, along with the phone numbers and locations of command centers and other backup operating locations. Phone trees are another example.
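For readers who keep such information sheets in electronic form, a phone tree can be sketched as a simple mapping from each caller to the people that caller is responsible for notifying. The roles below are hypothetical, and the fallback rule (a caller who cannot reach someone takes over that person's calls) is an assumption made for illustration, not a practice described in this article.

```python
from collections import deque

# Hypothetical phone tree (assumption): each person calls those listed under them.
PHONE_TREE = {
    "Declaration Authority": ["Recovery Coordinator", "Facilities Lead"],
    "Recovery Coordinator": ["Network Team Lead", "Operations Team Lead"],
    "Facilities Lead": ["Security Desk"],
}


def call_sequence(tree, root, unreachable=frozenset()):
    """Return (dialer, callee, reached) tuples in breadth-first order.

    Assumed fallback rule: when a callee cannot be reached, the dialer
    takes over that person's list so the branch below is still notified.
    """
    calls = []
    seen = {root}                   # guards against someone listed twice
    queue = deque([(root, root)])   # (owner of the call list, person dialing)
    while queue:
        owner, dialer = queue.popleft()
        for callee in tree.get(owner, []):
            if callee in seen:
                continue
            seen.add(callee)
            reached = callee not in unreachable
            calls.append((dialer, callee, reached))
            # A reached callee makes their own calls; otherwise the dialer does.
            queue.append((callee, callee if reached else dialer))
    return calls


if __name__ == "__main__":
    for dialer, callee, reached in call_sequence(
        PHONE_TREE, "Declaration Authority", unreachable={"Recovery Coordinator"}
    ):
        status = "reached" if reached else "NOT reached"
        print(f"{dialer} calls {callee}: {status}")
```

Printing the sequence for each team member yields a short call list that fits on a single information sheet.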
When bundled together in a small book for a particular team member, these action plans and information sheets contain all that this team member needs to know to perform the recovery function. However, this framework means that a great many action plans and information sheets will be written. The Disaster Recovery Plan or Business Resumption Plan is already a complicated set of documents. Keep the procedures as simple as possible and avoid burdening people with unneeded information.
One good tool for accomplishing this goal is to modularize the plan. Build a comprehensive master book containing all procedures. Then extract from this book a set of subsidiary packages, each intended for a particular team or team member. In each subsidiary package, include all the procedures (action plans and information sheets) that individual is likely to need. The subsidiary package should be bound in red, well marked, and no more than 10 to 12 pages for most packages.
It is the domain of the CIO to review these subsidiary packages to be certain they are small, simple and usable. Clearly the construction of the detailed strategies and the writing of the procedures are tasks best delegated by the CIO to staff. Since the CIO will therefore not be involved in these details, the CIO is in an ideal position to ensure that the resulting packages and preparations are understandable. In this role, the CIO becomes protector of simplicity in the disaster recovery project. This is a fitting role for the manager of the complex, high technology world of information technology.
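The modular approach described above, a master book from which team packages are extracted and then checked for size, lends itself to a small sketch. The procedure titles, team names, and page counts here are hypothetical. The 12-page limit reflects the upper end of the 10-to-12-page guideline, and the article applies that guideline to most packages rather than all, so a flagged package is a prompt for review, not an error.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Procedure:
    title: str
    kind: str         # "action plan" or "information sheet"
    pages: int
    teams: frozenset  # teams likely to need this procedure


MAX_PACKAGE_PAGES = 12  # upper end of the 10-to-12-page guideline

# Hypothetical master book (assumption, for illustration only).
MASTER_BOOK = [
    Procedure("Activate reserve system", "action plan", 4, frozenset({"systems"})),
    Procedure("Disaster Declaration authorities", "information sheet", 1,
              frozenset({"systems", "command center"})),
    Procedure("Command center locations and phone numbers", "information sheet", 2,
              frozenset({"systems", "command center"})),
    Procedure("Damage assessment", "action plan", 6, frozenset({"command center"})),
    Procedure("Vendor notification", "action plan", 5, frozenset({"command center"})),
]


def build_package(master_book, team):
    """Extract the procedures one team is likely to need, in master-book order."""
    return [p for p in master_book if team in p.teams]


def package_pages(package):
    """Total page count of a subsidiary package."""
    return sum(p.pages for p in package)


def oversized_packages(master_book, teams, limit=MAX_PACKAGE_PAGES):
    """Return {team: page count} for packages that exceed the page limit."""
    sizes = {team: package_pages(build_package(master_book, team)) for team in teams}
    return {team: pages for team, pages in sizes.items() if pages > limit}


if __name__ == "__main__":
    for team in ("systems", "command center"):
        package = build_package(MASTER_BOOK, team)
        print(team, package_pages(package), [p.title for p in package])
    print("needs review:", oversized_packages(MASTER_BOOK, ["systems", "command center"]))
```

Because every package is derived from the single master book, a change to a procedure is made once and flows into each package that includes it.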