IT channel takeaway: No company should take disaster recovery lightly. This collection of key DR considerations and best practices can help you jumpstart
If the devastating hurricanes of 2005 have taught businesses along the Gulf Coast of the United States one painful lesson, it is that disaster recovery (DR) and protection of vital business data should not be ignored. While many of these businesses understand all too well the operational and financial impact of disruption to their business systems and processes, plenty of others still consider DR something akin to getting a root canal -- a painful and costly procedure that is best avoided for as long as humanly possible.
How risky is putting off DR planning? According to an IDC report, over 90 percent of companies fail within one year of a significant data loss. The Disaster Recovery Journal estimates that as many as 80 percent of all U.S. companies and 90 percent of European companies don't have an effective DR plan. Based on these statistics, it is apparent that the vast majority of companies are essentially playing Russian roulette with their systems and data, and ultimately, the future survival of their businesses.
The disaster recovery strategy's silver lining
Preparing for a disaster or disruption of any kind requires a comprehensive approach that encompasses hardware and software, networking equipment, power, connectivity and test procedures that ensure DR and restoration are achievable within recovery time objective (RTO) and recovery point objective (RPO) targets. While implementing a thorough DR and data protection plan isn't a small task, the potential benefits are significant.
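To make the two targets concrete, here is a minimal sketch of how the results of a recovery drill might be checked against them. RTO bounds how long systems may stay down, and RPO bounds how much recent data may be lost. The function name, its parameters and the sample timestamps are hypothetical and chosen only for illustration.

```python
from datetime import datetime, timedelta

def meets_objectives(failure_time, last_good_copy, service_restored, rto, rpo):
    """Return (rto_met, rpo_met) for one recovery drill.

    RTO is compared with downtime: failure until service is restored.
    RPO is compared with the data-loss window: last recoverable copy until failure.
    """
    downtime = service_restored - failure_time
    data_loss_window = failure_time - last_good_copy
    return downtime <= rto, data_loss_window <= rpo

# Hypothetical drill: failure at 09:00, last replica at 08:45, service back at 12:30.
failure = datetime(2006, 9, 1, 9, 0)
result = meets_objectives(
    failure_time=failure,
    last_good_copy=datetime(2006, 9, 1, 8, 45),
    service_restored=datetime(2006, 9, 1, 12, 30),
    rto=timedelta(hours=4),
    rpo=timedelta(minutes=30),
)
print(result)  # (True, True)
```

Recording both values after every test gives the organization evidence, rather than an assumption, that its targets are achievable.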
Beyond all this talk of DR gloom and doom, there is a silver lining. The good news is that new technologies and methodologies available today have made implementing a DR strategy more affordable, efficient and reliable. The benefits of such a strategy go beyond DR, resulting in process improvements that will make daily operations less costly and easier to manage.
Potholes on the road to DR success
Before examining some of the ways organizations can employ new technologies and methodologies to implement successful business continuity and DR plans, let's first look at the common stumbling blocks. One of the main reasons that instituting an effective DR strategy is so costly for most organizations is the need to mirror all equipment in the primary data center. Upgrading and maintaining both a primary and remote facility requires a significant investment of resources.
Another challenge organizations face is recognizing that DR planning must be part of a larger business recovery strategy. This means understanding all of the requirements, processes and interdependencies associated with the organization's primary business activities. An organization's DR strategy should look at the big picture, taking into account all aspects of its business needs.
Reliance on tape backups as a primary means of DR preparedness is a risky proposition at best. While they do present one form of cost-effective restoration capability, tapes are prone to failure. In fact, the average failure rate for tape drives and media is much higher than that of the alternatives, disk-based replication or backup. Second, tapes stored off-site may not be readily accessible during a disaster, which can further hinder the timely restoration of systems and prove to be extremely costly. Furthermore, off-site tapes or replicated data should be located at least 120 miles away from the corporate data center. That way a widespread disaster, such as the hurricanes of 2005, does not strike both the primary data center and the secondary site or off-site storage facility, which could severely hamper data recovery.
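The 120-mile guideline is easy to check when evaluating candidate sites. The sketch below uses the haversine formula to estimate the great-circle distance between two locations. The city coordinates are approximate and the site choices are purely illustrative, not recommendations.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius

def great_circle_miles(lat1, lon1, lat2, lon2):
    """Haversine distance in miles between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))

def far_enough(primary, secondary, minimum_miles=120):
    """True if the secondary site is at least minimum_miles from the primary."""
    return great_circle_miles(*primary, *secondary) >= minimum_miles

# Approximate (latitude, longitude) pairs, for illustration only.
NEW_ORLEANS = (29.95, -90.07)
BATON_ROUGE = (30.45, -91.19)
HOUSTON = (29.76, -95.37)

print(far_enough(NEW_ORLEANS, BATON_ROUGE))  # False
print(far_enough(NEW_ORLEANS, HOUSTON))      # True
```

Straight-line distance is only a first filter. Two sites that pass this check can still share a flood plain, a power grid or a storm track.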
Establishing clearly defined DR roles, responsibilities and the proper chain of command is critical in order for an organization to respond to a crisis. Failure to establish such a structure can result in what the military commonly refers to as the "fog of war." The stress and disorganization that typically surround a disaster leave most businesses incapable of executing effective recovery procedures. DR planning requires thorough organization and precise execution, and must encompass equipment, personnel and logistics.
Having a DR plan is obviously important, but without thorough testing and documented recovery plans, the effectiveness of any plan is only theoretical. For many organizations, end-to-end testing is considered too cumbersome and disruptive to operations, so it is simply replaced by more limited testing. Because of the complex interconnections between multiple systems and applications, complete testing of recovery procedures is critical for organizations that want to have confidence in their plan. Testing is also the only way to uncover flaws or problems that might otherwise go unnoticed until it is too late. As new applications and systems are added, modified or removed, it is important for organizations to regularly update their DR plans and include these changes in their testing protocols.
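One simple way to catch a plan that has fallen behind the environment is to compare the current system inventory with the systems the DR plan actually covers. The sketch below assumes both lists are available as plain names. The function and the sample system names are hypothetical.

```python
def plan_drift(inventory, plan_coverage):
    """Compare deployed systems with the systems named in the DR plan.

    'unprotected' lists systems in production that the plan does not cover.
    'stale' lists plan entries for systems that no longer exist.
    """
    inventory, plan_coverage = set(inventory), set(plan_coverage)
    return {
        "unprotected": sorted(inventory - plan_coverage),
        "stale": sorted(plan_coverage - inventory),
    }

# Hypothetical example: a CRM system was added and a legacy fax server retired.
drift = plan_drift(
    inventory=["crm", "email", "erp"],
    plan_coverage=["email", "erp", "fax-gateway"],
)
print(drift)  # {'unprotected': ['crm'], 'stale': ['fax-gateway']}
```

Running a comparison like this before each scheduled test keeps the test scope aligned with what is really in production.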
The benefits of archiving
As business data continues to grow at an exponential rate, and pervasive regulations such as Sarbanes-Oxley, HIPAA, and the Patriot Act compel organizations to store data longer, archiving presents a meaningful solution for dealing with DR and operational challenges. By building a tiered storage infrastructure and setting in motion information lifecycle management (ILM) practices, an organization can maximize its resources. Archiving allows a business to allocate different types of data and applications to different classes of storage based on performance, availability and recovery requirements. By understanding the relative value of data, companies are then able to maximize their storage investments by maintaining only critical information on costly, high-performance storage platforms while off-loading less essential data to less expensive, lower-performance devices.
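A tiering policy of this kind can be expressed as a small rule. The sketch below places a data set on high-performance storage if it must be recovered quickly or is read heavily, and on lower-cost storage otherwise. The field names, the two cutoffs and the sample data sets are assumptions made for illustration. A real policy would be derived from the organization's own requirements.

```python
from dataclasses import dataclass

@dataclass
class DataSet:
    name: str
    rto_hours: float    # how quickly the data must be restored
    reads_per_day: int  # rough measure of performance demand

def assign_tier(ds, rto_cutoff_hours=4, busy_cutoff_reads=1000):
    """Hypothetical policy: a tight recovery target or heavy use earns the fast tier."""
    if ds.rto_hours <= rto_cutoff_hours or ds.reads_per_day >= busy_cutoff_reads:
        return "high-performance"
    return "lower-cost"

orders = DataSet("order database", rto_hours=1, reads_per_day=50000)
old_projects = DataSet("2003 project files", rto_hours=72, reads_per_day=3)

print(assign_tier(orders))        # high-performance
print(assign_tier(old_projects))  # lower-cost
```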
Furthermore, organizations that couple tiered storage architectures with policies to automate the migration of data to the right platforms can also benefit from reduced backup and recovery times. The number and type of storage tiers can vary according to an organization's needs. Organizations can add tiers, such as ATA disk technology, to accommodate information in moderate demand, as well as content addressable storage (CAS) and WORM drives for materials that require long-term archival and guaranteed authenticity, such as those related to compliance regulations. The ATA and CAS tiers put into play for archiving can also help minimize backup costs because companies can begin to remove static data from their daily backup schemes, which will reduce their investment in tape while simultaneously shrinking their backup windows. Organizations are also able to achieve full backup and recovery much faster because there is less critical data to move.
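The effect on the backup window can be estimated with simple arithmetic: the time a backup takes is roughly the amount of data divided by sustained throughput. The sketch below applies that to a before-and-after comparison. The volumes and the throughput figure are invented for illustration and ignore factors such as incremental backups and compression.

```python
def backup_window_hours(data_gb, throughput_gb_per_hour):
    """Rough backup duration: data volume divided by sustained throughput."""
    return data_gb / throughput_gb_per_hour

# Hypothetical environment: 10,000 GB in total, of which 6,000 GB is static
# data that has been moved to an archive tier and dropped from daily backups.
total_gb = 10000
static_gb = 6000
throughput = 500  # GB per hour, assumed

before = backup_window_hours(total_gb, throughput)
after = backup_window_hours(total_gb - static_gb, throughput)
print(before, after)  # 20.0 8.0
```

The same calculation applies to restores, which is why moving static data out of the daily scheme shortens recovery as well as backup.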
Server virtualization mitigates DR challenges
Recovery after a system failure is usually a complicated process that requires significant IT manpower. If a cold site is used as a DR location and backup is tape-based, restoring systems may take an extended period. Add to this the chaotic circumstances surrounding a disaster situation, and the potential for failure rises dramatically. In fact, nearly 25 percent of restores from backups are subject to errors.
Using the latest virtual infrastructure technologies, organizations can alleviate the need for many of the most costly and time-consuming recovery processes. Virtualization allows multiple virtual servers, with heterogeneous operating systems, to run on the same physical hardware while maintaining system isolation. Because each virtual server environment (including data, application, operating system, BIOS, and virtualized hardware) is saved as a single file, applications can be restored easily and rapidly to any hardware with a virtualization platform. Through virtual infrastructure, all aspects of business continuity can be improved, including faster, more flexible, and more reliable DR at a lower cost. Virtualization also offers the added benefit of allowing thorough DR testing without the cumbersome need for additional hardware.
Don't underestimate "run books"
During the development of a thorough DR plan, it is important to build not only the plan for recovering the enterprise, but also the plan for recovering each system. The creation of run books documenting all functional areas within IT, including applications, databases, networks and servers, is essential to a well-administered DR plan. The run book combined with the larger DR plan gives your organization a detailed process to follow to recreate your systems and protect your business from the smallest to the largest disaster. A run book that includes recovery objectives, key contacts and responsibilities, step-by-step recovery plans and testing protocols is worth its weight in gold.
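The elements of a run book listed above map naturally onto a simple record, which also makes it possible to check each run book for gaps. The sketch below is one possible shape, not a standard format. The class, its field names and the sample entries are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class RunBook:
    system: str
    recovery_objectives: dict = field(default_factory=dict)  # e.g. RTO and RPO targets
    contacts: dict = field(default_factory=dict)             # responsibility -> person
    recovery_steps: list = field(default_factory=list)       # ordered, step by step
    test_protocol: list = field(default_factory=list)        # how recovery is verified

def missing_sections(run_book):
    """Return the names of run book sections that are still empty."""
    sections = {
        "recovery_objectives": run_book.recovery_objectives,
        "contacts": run_book.contacts,
        "recovery_steps": run_book.recovery_steps,
        "test_protocol": run_book.test_protocol,
    }
    return [name for name, content in sections.items() if not content]

# Hypothetical run book for a mail server, with no test protocol written yet.
mail = RunBook(
    system="mail server",
    recovery_objectives={"rto_hours": 4, "rpo_minutes": 30},
    contacts={"recovery lead": "J. Smith"},
    recovery_steps=["Provision replacement host", "Restore mailbox store", "Repoint DNS"],
)
print(missing_sections(mail))  # ['test_protocol']
```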
There is no reason any organization should play Russian roulette with its business data. By leveraging new technologies, such as tiered storage, archiving and virtual infrastructures, companies can make DR planning more affordable, efficient and reliable. In addition to helping avert disaster and allowing an organization to get back online within RTO and RPO goals, today's DR solutions can make a significant impact on the cost and manageability of daily operations.
About the author: Richard Bocchinfuso is vice president and CTO of MTI Technology, a leading multi-national provider of consulting services and comprehensive information infrastructure solutions for mid to large-size
This was first published in September 2006