Performing a DR backup on a VMware ESX server

This section of the chapter excerpt will focus on backup solutions provided by the VMware ESX server before a disaster and recovery options after a disaster.

Solution provider takeaway: VMware ESX Server is today's leading virtual infrastructure platform in mission-critical environments. This section of the chapter excerpt from the book VMware ESX Server in the Enterprise: Planning and Securing Virtualization Servers will focus on using the platform for DR backup.


But what is a full DR backup? As stated previously, there are two major backup styles. The first, in VMware ESX Server terms, is a backup for individual file restoration, made from within the VM. The second is a DR-level backup of the full VM disk image and configuration file. The difference is the restoration method. A standard backup, using agents running within the VM, usually follows these steps for restoration:

  1. Install OS into machine.
  2. Install restoration tools.
  3. Test restoration tools.
  4. Restore system.
  5. Test restoration.

A full DR-level backup has the following restoration process:

  1. Restore VMDK and configuration file.
  2. Register VM into ESX.
  3. Boot VM.
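
The three-step restoration above can be sketched as a short command plan. This is a minimal sketch under stated assumptions, not the book's procedure: it assumes the backup holds one directory per VM, that the ESX 3 service console `vmware-cmd` utility is used for registration and power-on, and that a plain `cp` stands in for whatever restore tool is actually in use. The paths and VM name are hypothetical, and the plan is only built as text, never executed.

```python
from pathlib import PurePosixPath


def restore_plan(backup_dir, datastore, vm_name):
    """Return the shell commands for a full DR-level restore, in order.

    Assumptions (not from the source text): one directory per VM in the
    backup, holding its VMDK and .vmx files, and vmware-cmd available on
    the service console.
    """
    src = PurePosixPath(backup_dir) / vm_name
    dst = PurePosixPath(datastore) / vm_name
    vmx = dst / f"{vm_name}.vmx"
    return [
        # 1. Restore VMDK and configuration file (cp is a stand-in).
        f"cp -r {src} {PurePosixPath(datastore)}/",
        # 2. Register VM into ESX.
        f"vmware-cmd -s register {vmx}",
        # 3. Boot VM.
        f"vmware-cmd {vmx} start",
    ]


if __name__ == "__main__":
    for command in restore_plan("/vmimages/backup", "/vmfs/volumes/ds1", "web01"):
        print(command)
```

Printing the plan rather than running it keeps the sketch safe to try on any machine; the commands should be reviewed against the installed ESX version before use.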

As can be seen, the restoration process for a full DR backup is much faster than the normal method, which in many cases makes a full DR backup more acceptable, although it generally requires more hardware. Which hardware is the real question, and one that needs to be considered for VMware ESX Server. A standard stand-alone ESX Server consists of lots of memory and as much disk as can be placed into the server. A standard ESX Server attached to a remote data store consists of lots of memory and very little local disk space, and a boot from SAN (BFS) ESX Server usually has no local disk space, which is not a best practice. Our best practice for installing ESX outlines a need for a very large /vmimages directory and a local VMFS partition, and the main reason for this is to perform safe backups. Figure 12.1 presents backup steps from the point of view of a simple ESX Server connected to an entry-level SAN. Entry-level remote data stores are missing the BC or copy features available on larger remote data stores.

[Figure 12.1: Backup paths for a simple ESX Server connected to an entry-level SAN]

Whether there is an entry-level remote data store, SCSI-attached storage, or local storage, the steps for creating DR backups are similar. There are a few ways to create backups, and the methods are similar no matter where the data will eventually reside. DR backups can be made in many ways using an equally varied number of tools. The first method is built into ESX, the second is to use VMware Consolidated Backup (VCB), and a third is to use a third-party tool such as Vizioncore's ESXRanger. All can be used to eventually place the data on a tape device, local storage, remote storage, or a remote hot site. File restore backups can be made using VCB and other third-party backup agents.

The simplest form of backup is the single ESX Server approach, as outlined in Figure 12.1. For this method, it does not matter whether the VMFS data stores are SAN, iSCSI, or even local SCSI. This method works quite well with low-cost, no-frills data store solutions like an entry-level SAN without BC features.

Figure 12.1 shows several DR backup paths and destinations that are intrinsic to the ESX version 3 software. All but one of these paths exist in earlier versions of ESX. In addition, some third-party backup solutions provide the same functionality as the intrinsic tools but in a more graphical fashion.

Backup Paths

Path 1, designated by the solid gray lines, represents a common backup approach for earlier versions of ESX. This approach still provides a level of redundancy that protects a system from catastrophic remote storage failures. This is a full DR-level backup:

  • VMs are exported from the remote or local VMFS datastores to a placeholder on the local machine. This placeholder is usually an ext3 file system, but should be another VMFS designated as a special repository just in case the SAN fails. It would not be available for normal usage.
  • The exported VM files are in turn sent to a remote backup server through a secure network copy mechanism.
  • All but the most important VM files are then deleted from the ESX Server. By keeping the most important VM files locally, we have taken a simple and useful precaution in the event of failed remote storage.
  • The remote backup server sends the data to tape storage.
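
The first three steps of Path 1 can be sketched in a few lines. This is a minimal sketch, assuming a plain directory copy as a stand-in for the VM export and a caller-supplied `send` function as a stand-in for the secure network copy; the function name, parameters, and VM names are hypothetical and do not come from ESX.

```python
import shutil
from pathlib import Path


def path1_backup(vm_dirs, staging, send, keep_local):
    """Stage each VM directory locally, hand it to `send`, then prune.

    vm_dirs:    directories on the datastore, one per VM
    staging:    local placeholder directory (the special repository)
    send:       callable(Path) that ships a staged copy to the backup server
    keep_local: names of the most important VMs to retain in staging
    Returns the sorted names of the VMs still held locally afterwards.
    """
    staging = Path(staging)
    staging.mkdir(parents=True, exist_ok=True)
    for vm_dir in map(Path, vm_dirs):
        staged = staging / vm_dir.name
        shutil.copytree(vm_dir, staged)   # stand-in for the VM export
        send(staged)                      # stand-in for the secure network copy
        if vm_dir.name not in keep_local:
            shutil.rmtree(staged)         # delete all but the important VMs
    return sorted(p.name for p in staging.iterdir())
```

Passing `send` as a parameter keeps the sketch independent of any particular copy tool, and the final step (the backup server writing to tape) happens off the ESX Server, so it is left out here.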

Path 2, designated by the short-dash light-gray line, is very similar to Path 1. This path is only available with ESX version 3 and represents the use of VCB. It can be either a full DR-level backup or a per-file backup, the latter only if the VCB proxy server, which runs Windows 2003, understands the file system to be mounted by the VCB tools:

  • The remote storage data stores are mounted onto a VCB proxy server.
  • The VCB proxy server will mount the VM from the remote datastore and either export the VM to another location on the VCB in a monolithic full DR backup, or provide a means to access the VMDKs for a per-file backup. (Once more, per-file backup is only available if the VCB Proxy understands the underlying file system of the VMDK.)
  • When the VMDK is exported or the per-file backup is finished, the proxy server unmounts the VMDKs of the VM.
  • The data is then sent to tape or a tape server from the VCB proxy.
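
The rule that governs Path 2 (per-file backup only when the proxy understands the guest file system) can be written as a small decision function. This is an illustrative sketch only: the function name, the mode labels, and the example set of file systems a Windows 2003 proxy can read are assumptions made for the example, not values taken from VCB.

```python
# Assumed for illustration: file systems a Windows 2003 proxy can read.
PROXY_READABLE_FS = frozenset({"ntfs", "fat32"})


def vcb_backup_mode(guest_fs, want_per_file, proxy_readable_fs=PROXY_READABLE_FS):
    """Return 'per-file' or 'full-dr' for one VM.

    A per-file backup is chosen only when it is wanted and the proxy
    understands the file system inside the VMDK; otherwise the VM falls
    back to a monolithic full DR-level export.
    """
    if want_per_file and guest_fs.lower() in proxy_readable_fs:
        return "per-file"
    return "full-dr"


if __name__ == "__main__":
    for fs in ("NTFS", "ext3"):
        print(fs, "->", vcb_backup_mode(fs, want_per_file=True))
```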

Path 3 has two branches designated by the dash-dot black lines, and is used to clone VMs from one remote storage location to another; this is a poor man's BC copy that more expensive SANs can perform automatically. The branches can be explained as follows:

  • Upper path: VMs are exported to a local file system.
  • Upper path: VMs are imported or copied to a new file system on the remote storage.
  • Upper path: All but the most important VMs are deleted from the local file system.
  • Lower path: VMs are exported from one file system of the remote storage directly to another file system on the remote storage using service console-based VCB or other service console-based backup tools, including Vizioncore's ESXRanger Pro product.

Path 4 is designated by the long-dash, dark-gray line and is not a recommended path but is there for use in rare cases. This is an alternative to Path 1 in that instead of sending the files to a remote backup server, the data would be sent directly to a tape library using the service console. This path is not recommended because it requires an Adaptec SCSI HBA in the server, and when the SCSI HBA has issues, the entire VMware ESX Server may crash or need to be rebooted.

Modification to Path 3

As can be seen, even for a single-server ESX Server, there are a myriad of paths for backup and BC. Now let's look at a slightly different configuration. This configuration, outlined in Figure 12.2, describes a system where the remote storage has the capability of doing a business copy or LUN snapshot.

[Figure 12.2: Backup paths when the remote storage supports a business copy or LUN snapshot]

Figure 12.2 gives us a change to one of our existing backup paths described previously. Specifically, that change is a modification of Path 3.

Path 3 has two branches designated by the dash-dot black lines and is used to clone VMs from one remote storage location to another; a poor man's BC copy that the more expensive SANs can perform automatically. Path 3 and the modification (essentially creating Path 5) work this way:

  • Upper path: VMs are exported to a local file system.
  • Upper path: VMs are imported or copied to a new file system on the remote storage.
  • Upper path: All but the most important VMs are deleted from the local file system.
  • Middle path (previously lower path): VMs are exported from one file system of the remote storage directly to another file system on the remote storage using service console-based VCB or other service console-based backup tools, including Vizioncore's ESXRanger Pro product.
  • Lower path (this is the new path or Path 5): If a SAN is capable of doing a LUN snap or LUN-to-LUN copy, it is now possible to copy the complete LUN using just the SAN software. ESX is unaware of this behavior. However, the caveat is that this produces a crash-consistent backup, whereas the other methods do not.

The LUN snap or LUN-to-LUN copy procedure starts by mirroring the data from one LUN to another LUN. When the mirror is fully built and consistent, it is snapped off from the primary LUN, creating an instantaneous copy of the LUN with no impact on the ESX Server. The LUN-to-LUN copy would be crash consistent and happen while VMs are running. Now here is the tricky part: because the new LUN houses a VMFS, and thus all the VMDKs, another VMware ESX Server is required to access the new LUN so that the VMDKs can be exported to a local disk and from there be dropped to tape. This backup processing ESX Server would in general not be used for anything else and may belong to its own farm. For ESX versions earlier than 3.0, a method to transfer the VM configuration files would need to be added, too, but that is a small chunk of data to transfer off the production ESX Servers compared to the VMDKs and could be set up to be transferred only when the files change.
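
The ordering constraint in this procedure (mirror first, snap off only when the mirror is fully built, back up only from the snapped-off copy) can be illustrated with a toy model. This is a sketch of the sequence only, under the assumption that progress is counted in blocks; the class and function names are hypothetical, and no real SAN software exposes this interface.

```python
class LunMirror:
    """Toy model of a SAN mirror between a primary LUN and its copy."""

    def __init__(self, total_blocks):
        self.total_blocks = total_blocks
        self.synced_blocks = 0
        self.snapped_off = False

    @property
    def consistent(self):
        # The mirror is consistent once every block has been copied.
        return self.synced_blocks == self.total_blocks

    def sync(self, blocks):
        if self.snapped_off:
            raise RuntimeError("copy already snapped off from the primary LUN")
        self.synced_blocks = min(self.total_blocks, self.synced_blocks + blocks)

    def snap_off(self):
        if not self.consistent:
            raise RuntimeError("mirror is not fully built yet")
        self.snapped_off = True


def backup_steps(mirror):
    """Return the remaining steps once the copy has been snapped off."""
    if not mirror.snapped_off:
        raise RuntimeError("snap off the mirror before backing it up")
    return [
        "present the copied LUN to the backup-processing ESX Server",
        "export the VMDKs from the copied VMFS to local disk",
        "write the exported VMDKs to tape",
    ]
```

The production ESX Server appears nowhere in the model, which mirrors the point made above: the copy and the export both happen without its involvement.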

A crash-consistent backup is one in which the data written to the VMDKs has been halted as if the VM has crashed. Therefore, a VM restored from a crash-consistent backup has the same chance of booting cleanly as a VM that has actually crashed. One solution to this problem is to quiesce the VMDKs prior to the LUN mirror copy being completed, which would require some way to communicate from the SAN to the ESX Servers. The creation of a snapshot, which VCB does, will quiesce the VMDK so that it is safe to copy. Unfortunately, disk quiescing only happens in Windows, NetWare, and a few other types of VMs, but not necessarily in Linux VMs. It also requires VMware Tools to be installed. It is possible, however, to write your own quiescing tools to be run when the VM is placed into snapshot mode.
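
The conditions above can be collected into a small check that predicts which kind of backup a snapshot will yield. This is a hedged sketch: the function and its parameters are hypothetical, the set of guest types lists only the two named in the text (the "few other types" are left out), and it assumes that a custom quiescing tool also depends on VMware Tools being installed.

```python
# Guest types the text names as supporting disk quiescing.
QUIESCE_CAPABLE = frozenset({"windows", "netware"})


def backup_consistency(guest_os, tools_installed, custom_quiesce_script=False):
    """Return 'quiesced' or 'crash-consistent' for a snapshot-based backup.

    Assumption: without VMware Tools nothing can quiesce the disks, so a
    custom quiescing script only helps when the tools are installed.
    """
    if not tools_installed:
        return "crash-consistent"
    if guest_os.lower() in QUIESCE_CAPABLE or custom_quiesce_script:
        return "quiesced"
    return "crash-consistent"


if __name__ == "__main__":
    print(backup_consistency("Windows", tools_installed=True))
    print(backup_consistency("Linux", tools_installed=True))
```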

Additional Hot Site Backup Paths

Figure 12.3 demonstrates multiple methods to create a hot site from the original ESX environment. A hot site is limited in distance by the technology used to copy data from one location to another. There are three methods to copy data from site to site: via Fibre, via network, and via sneaker. In addition to the four existing paths, we can now add the paths covered in this section.

Path 6, depicted by the leftmost dash-dot-dot gray line, is the copying of local remote storage to similar remote storage at a hot site. Path 6 picks up where Path 3 ends and adds on the transfer of a full LUN from one remote storage device to another remote storage device using Dark Fibre, Fibre Channel over IP, or standard networking methods. After the LUN has been copied to the remote storage, Path 2 or Path 3 could once more be put into use at the hot site to create more backups.

[Figure 12.3: Hot site backup paths]

Path 7, depicted by the rightmost dash-dot-dot gray line, is the copying of the local ESX storage to a remote location using a secure network copy mechanism. Path 7 picks up where Path 1, 2, or even 3 leave off and uses standard network protocols to get the data onto a backup server on the hot site. From the backup server at the hot site, the VMs could then be restored to the hot site datastores waiting to be enabled in case of failure. Or from there they could be sent to a remote tape device.

Path 8 is depicted by the long-dash, short-dash line and uses the service console of the VMware ESX Server to create the DR backup described in Path 7. Path 8 picks up where part of Path 1 finishes and would initiate a copy of the data not only to the local backup server but also to the remote backup server. This path is not recommended because you do not want to tie up service console resources with a lengthy network traversal.

Path 9 is depicted by the rightmost long-dash line and uses the service console to write directly to the tape server at the remote site. This path is not recommended because it will tie up service console resources and requires an IP-based tape device at the hot site. Path 9 would pick up where part of Path 1 finishes and would initiate a remote backup using an IP-based network. However, instead of going to a disk like Path 8 does, it would go directly to a tape device.

Path 10 is not depicted on the diagram, but it would be the taking of backup tapes from the main site to the hot site to be restored or stored in secure storage. This is often referred to as a sneaker-net approach.

Summary of DR Backup

No matter how the data gets from the primary site to the hot site, the key is to get it there quickly, safely, and in a state that will run a VM. To that end, the long-dash and long-dash, short-dash paths outlined on all the diagrams should be avoided because they are not necessarily very fast, safe, or consistent. The green or solid line and short-dashed paths provide the most redundancy, whereas the dash-dot paths provide the least impact, and the dash-dot-dot paths provide the better methods to create hot sites or move data from primary site to hot site short of sneaker net. If all these paths fail, however, remember your tapes sitting offsite in a vault and the combination to get access to them quickly. Above all, always test your backups and paths, whether local or using a hot site. Always remember, backup and DR go hand in hand.
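
As a quick reference, the ten paths and the rule for which to avoid can be captured in a small table. This is a sketch built from the descriptions in this section; the type and function names are hypothetical, and the line style given for Path 5 is an assumption (it is described only as a new branch of Path 3).

```python
from typing import NamedTuple, Optional


class BackupPath(NamedTuple):
    number: int
    line_style: Optional[str]  # None when the path is not drawn
    summary: str


PATHS = [
    BackupPath(1, "solid", "export locally, copy to backup server, then tape"),
    BackupPath(2, "short-dash", "VCB proxy, full DR or per-file backup"),
    BackupPath(3, "dash-dot", "clone VMs between remote storage locations"),
    BackupPath(4, "long-dash", "service console direct to tape library"),
    BackupPath(5, "dash-dot", "SAN LUN snap or LUN-to-LUN copy"),  # style assumed
    BackupPath(6, "dash-dot-dot", "LUN copy to hot site storage"),
    BackupPath(7, "dash-dot-dot", "network copy to hot site backup server"),
    BackupPath(8, "long-dash, short-dash", "service console copy to hot site"),
    BackupPath(9, "long-dash", "service console to remote tape"),
    BackupPath(10, None, "sneaker net: carry tapes to the hot site"),
]

# Line styles the summary says to avoid.
AVOID_STYLES = frozenset({"long-dash", "long-dash, short-dash"})


def paths_to_avoid(paths=PATHS):
    """Return the numbers of the paths drawn with a discouraged line style."""
    return [p.number for p in paths if p.line_style in AVOID_STYLES]


if __name__ == "__main__":
    print("Avoid paths:", paths_to_avoid())
```

The result matches the individual descriptions: Paths 4, 8, and 9 are the ones each marked as not recommended.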


About the book

VMware ESX Server in the Enterprise: Planning and Securing Virtualization Servers is the definitive, real-world guide to planning, deploying, and managing today's leading virtual infrastructure platform in mission-critical environments. Purchase the book from Prentice Hall.

Reproduced from the book VMware ESX Server in the Enterprise. Copyright 2008, Prentice Hall. Reproduced by permission of Pearson Education, Inc., 800 East 96th Street, Indianapolis, IN 46240. Written permission from Pearson Education, Inc. is required for all other uses.

This was first published in September 2008
