AUTOMATED DISASTER RECOVERY WITH VMWARE SRM AND DELL EQUALLOGIC iSCSI SANs
Organizations of all sizes have embraced virtualization as a key technology for consolidating server and storage infrastructure, helping reduce management costs and increase availability. Now IT managers are looking to virtualization to help them overcome the challenges of traditional disaster recovery, and to tools that automate the recovery process. Deploying Dell EqualLogic PS Series Internet SCSI (iSCSI) arrays in conjunction with VMware Site Recovery Manager (SRM) software can help organizations implement simple, cost-effective, highly automated disaster recovery for virtualized environments.
ADDRESSING TRADITIONAL DISASTER RECOVERY DILEMMAS
Traditional disaster recovery is challenging in part because it relies on specialized hardware that is expensive and complex. Few IT staffs have the expertise to manage and maintain specialized systems for disaster recovery, which usually require costly outside service and support. IT managers also face significant costs for licensing replication software and leasing the required network bandwidth between sites. In the face of these costs and management complexity, organizations often can provide disaster recovery only for individual applications or departments. Over time, however, this approach can leave organizations with disparate, incompatible implementations that are inefficient to manage and provide only partial protection.
Apart from the infrastructure investment for disaster recovery, organizations often lack the internal expertise to manually coordinate site failover for what may be hundreds or thousands of servers. Although small organizations typically have fewer servers to manage than large organizations, they may also lack the resources or expertise to develop recovery plans on their own. A typical recovery plan can include hundreds of detailed steps, from changing cable configurations to bringing recovery site servers online in the proper order, all of which must be fully documented. If an event requires travel to a remote site, where the recovery documentation must be followed exactly and the primary IT administrator may be out of reach, additional complications can delay the site recovery.
Plan testing can also be a significant challenge for IT organizations. Testing is essential to help ensure a plan works properly, and may also be required by regulatory agencies or insurance companies as proof that an effective disaster recovery plan is in place.
However, the test process can cause unacceptable disruption to organizations and their customers. Typically, it takes a day or more to repeatedly adjust and retest a plan manually—and because the process involves both the production and recovery sites, the production environment must be shut down. Many companies simply cannot afford to have their services unavailable to internal or external customers for long periods.
CHANGING THE ECONOMICS OF DISASTER RECOVERY WITH iSCSI
Today, the economics of disaster recovery are changing for the better. Remote replication is available for iSCSI storage area networks (SANs). These SANs do not depend on the specialized equipment required by traditional Fibre Channel SANs, and enable organizations to leverage Ethernet infrastructure and IP networking skills already in place—helping reduce training and ongoing management costs.
The iSCSI protocol enables virtual storage implementations that complement and extend the server virtualization made possible by solutions such as VMware Infrastructure. Server virtualization consolidates enterprise application environments, while the virtualized SAN consolidates data assets to create flexible pools of networked resources. Together, server and storage virtualization enable greater scalability, flexibility, and performance compared with traditional all-physical architectures.
Organizations can realize other advantages with iSCSI SANs. The IT environment can be simplified by standardizing on IP networking for server communications, storage access, and off-site replication, further helping reduce complexity and costs. In addition, the lack of distance limitations with IP networking means that a remote recovery site can be located almost anywhere for increased disaster tolerance.
As in physical environments, IT organizations can face formidable challenges when manually developing, testing, and implementing recovery plans for virtualized environments. Tools from VMware and Dell help address these challenges by building on virtualized iSCSI storage to help simplify management and deployment of automated disaster recovery plans.
INTEGRATING REPLICATION OVER IP INTO VMWARE SRM
Dell EqualLogic PS Series arrays and VMware SRM offer an approach to disaster recovery designed to be quick, automated, and economical. PS Series arrays help reduce the complexity and cost barriers of traditional SANs by providing a cost-effective iSCSI SAN infrastructure that can be maintained efficiently by IT staff. They come with Auto-Replication software included, avoiding a major licensing expense and significant recurring software support subscription costs. Dell-engineered SRM Storage Adapter software, available as a download at no additional cost, integrates the PS Series Auto-Replication feature directly into VMware SRM.
The integration of Dell EqualLogic PS Series arrays and VMware SRM through the SRM Storage Adapter software combines the positive economics of replication over IP with automated disaster recovery made possible through virtualization, helping save time and enhance ease of use. This approach enables automated remote recovery and testing for large enterprises as well as automated recovery plan development that helps small organizations overcome the challenges imposed by limited staff and resources. It also advances enterprise reliability, using the redundant, hot-pluggable storage architecture of PS Series arrays and advanced system and disk monitoring capabilities to enhance system availability.
VMware SRM is a new VMware Infrastructure-based solution that provides disaster recovery management and automation for virtualized data centers—integrating tightly with VMware VirtualCenter and Dell EqualLogic PS Series array replication for recovery designed to be rapid, reliable, manageable, and cost-effective. It provides centralized management of recovery plans that not only automates the recovery process but enables enhanced testing of recovery plans. Using VMware SRM, a single IT administrator can configure a disaster recovery implementation quickly and easily (see the "Step by step: Setting up a disaster recovery plan" sidebar in this article).
AUTOMATING DISASTER RECOVERY FOR DATA CENTERS
The native Auto-Replication feature of Dell EqualLogic PS Series arrays helps perform the key disaster recovery function—making copies of data and sending the copies to a remote location at a safe distance from the primary data center. This feature integrates directly into the IP network to help overcome distance limitations. The arrays support one-to-one, bidirectional, or many-to-one replication, and the time interval for replication can be adjusted to meet the needs of the organization.
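When choosing a replication interval, one practical check is whether the data changed during an interval can be sent to the remote site before the next interval begins. The following is a minimal sketch of that arithmetic in Python. The function names and the sample figures (9 GB of changes, a 100 Mb/s link) are illustrative assumptions, not values from PS Series documentation, and the calculation ignores protocol overhead and any compression.

```python
def transfer_seconds(changed_gb, link_mbps):
    """Seconds needed to send changed_gb gigabytes over a link_mbps link.

    Uses decimal units (1 GB = 8,000 megabits) and ignores protocol
    overhead, so real transfers will take somewhat longer.
    """
    megabits = changed_gb * 8 * 1000
    return megabits / link_mbps


def interval_is_sustainable(changed_gb, link_mbps, interval_minutes):
    """True if one interval's changes can be sent within one interval."""
    return transfer_seconds(changed_gb, link_mbps) <= interval_minutes * 60


# Hypothetical example: 9 GB of changed data per interval, 100 Mb/s link.
seconds_needed = transfer_seconds(9, 100)
fits_15_min = interval_is_sustainable(9, 100, 15)
fits_10_min = interval_is_sustainable(9, 100, 10)
```

If the changes do not fit within the interval, the options are a longer interval, more bandwidth between sites, or replicating fewer volumes.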
Disaster recovery with this approach requires a PS Series array and a VMware VirtualCenter server at each site. The arrays connect to a switched Ethernet fabric and replicate over the IP network through the PS Series Auto-Replication software. The VirtualCenter servers running SRM software also communicate over the network. The customized Dell SRM Storage Adapter software helps tie the integrated solution together and enables comprehensive, automated site failover.
MINIMIZING MANUAL PROCESSES WHILE RETAINING CONTROL
If the primary site goes down, the volumes are already at the recovery site, and VMware SRM can automatically coordinate the process of bringing the environment online (see Figure 1). SRM runs the entire recovery plan, starting virtual machines in the intended order with updated networking configurations. Many manual procedures associated with traditional disaster recovery are eliminated, but administrators have comprehensive visibility into the execution of the recovery plan through VMware VirtualCenter, and can pause or stop execution as needed.
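The idea of a recovery plan that starts virtual machines in a defined order, and that an administrator can halt partway through, can be illustrated with a short Python sketch. This is not the SRM interface: the class names, virtual machine names, network labels, and the stop_before parameter are all hypothetical and serve only to show the ordering logic.

```python
from dataclasses import dataclass, field


@dataclass
class RecoveryStep:
    vm_name: str   # hypothetical virtual machine name
    priority: int  # lower numbers start earlier
    network: str   # hypothetical recovery-site network label


@dataclass
class RecoveryPlan:
    steps: list
    log: list = field(default_factory=list)

    def run(self, stop_before=None):
        """Start VMs in priority order; halt before stop_before if given.

        Returns True if every step ran, False if the plan was halted.
        """
        for step in sorted(self.steps, key=lambda s: s.priority):
            if step.vm_name == stop_before:
                self.log.append(f"stopped before {step.vm_name}")
                return False
            # A real plan would power on the VM here; this sketch only logs.
            self.log.append(f"power on {step.vm_name} on {step.network}")
        return True


# Steps are listed out of order on purpose; the plan sorts by priority.
plan = RecoveryPlan([
    RecoveryStep("web01", 3, "dr-net-b"),
    RecoveryStep("db01", 1, "dr-net-a"),
    RecoveryStep("app01", 2, "dr-net-a"),
])
completed = plan.run()
started_order = [entry.split()[2] for entry in plan.log]
```

Calling plan.run(stop_before="web01") on a fresh plan would start the database and application servers and then halt, which mirrors an administrator stopping execution at a chosen point.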
Another advantage of integrating Dell EqualLogic PS Series arrays and VMware SRM is the Fast-Failback capability included in the PS Series Auto-Replication feature. Fast-Failback helps eliminate the need to retransmit complete volumes when the production site is ready to come back online; instead, the system sends back only the changes that have occurred since the SRM failover operation, helping save time, bandwidth, and expense.
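The saving from sending only changes can be sketched in a few lines of Python. This is a simplified illustration of change-only transfer in general, not a description of how Fast-Failback is implemented inside the arrays. The block-number dictionaries and the blocks_to_send function are assumptions made for the example.

```python
def blocks_to_send(baseline, current):
    """Return only the blocks in current that differ from baseline."""
    return {
        index: data
        for index, data in current.items()
        if baseline.get(index) != data
    }


# Hypothetical volume contents, keyed by block number.
# The production copy is as it stood when the failover occurred.
production = {0: b"boot", 1: b"data-v1", 2: b"logs-v1"}

# The recovery site starts from the same copy, then takes new writes.
recovery = dict(production)
recovery[1] = b"data-v2"   # existing block modified after failover
recovery[3] = b"new"       # block written for the first time

# Failback: send back only what changed, then apply it at production.
delta = blocks_to_send(production, recovery)
production.update(delta)
```

In this example only two of the four blocks cross the link, and the production copy ends up identical to the recovery copy.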