Creating A New Hope: Backups and Disaster Recovery for VMs
By Anil Desai

published: Wednesday, November 14 2007

In the continuing struggle against data loss, it’s important for organizations to understand how they can protect their virtual machines. I’ll use a hypothetical case study to help illustrate the point. Remember the seemingly all-powerful Death Star? I’m sure that the architects of this needlessly complicated space station thought they’d never experience its complete destruction.

Let’s start with a disclaimer: George Lucas has neither confirmed nor denied the use of server virtualization in The Empire’s data centers (remember, this was a long, long time ago…). However, I’d like to think that if Darth & the Gang took the time to set up VMs, they’d want to be assured of their ability to recover from, oh, let’s say, X-Wing attacks. After all, this organization did experience quite the disaster (Star Wars Episode 4), as well as a fairly quick recovery (Episode 6). In this article, I’ll present ways in which The Empire might have used backup and disaster recovery techniques to help protect their valuable investments.

Start with Recovery Requirements
The most important step in designing a backup plan is to determine your recovery requirements. You should be sure to involve members from throughout your organization, lest you receive a mental throat-choking from a superior officer. Important considerations should include:
  • Acceptable data loss: This is the total amount of data that may be irretrievably lost in the case of a VM or host server failure. You’ll use this information to determine the backup frequency.
  • Tolerance for downtime: Different backup methods necessitate different recovery methods. Factor in the cost of downtime to figure out which types of data protection options are most relevant.
  • Single points of failure: The Death Star’s core was conveniently sized large enough for a couple of proton torpedoes to take it down. Don’t make the same mistake in your environment. Remember that complete reliability should focus on your VMs and the entire infrastructure. That includes network devices, physical server hardware, data center facilities, and just about everything your IT organization is responsible for managing. Focus on the most likely types of failures first, but be sure to consider the big picture (like the explosion of a planet-like space station).
  • Volume of data: Storage space may be getting cheaper, but it certainly isn’t free. VMs tend to be quite large, and planning for data storage is an important consideration. Factor in the following when estimating your storage capacity requirements and limitations:
    • Virtualization host server configuration data
    • VM configuration files
    • Virtual hard disks
    • Virtual network configuration files
    • Saved-state files
  • Budget limitations: Even an Empire has some financial considerations and limitations. Keep in mind your organization’s budget for backup and recovery. There will likely be some back-and-forth between business and technical managers. Overall, both sides should be prepared to make trade-offs based on costs.
  • Expertise: Some backup methods are more complicated than others and might require specialized hardware and training. Keep your IT staff’s experience level in mind when choosing a backup option. For example, before you plan to implement two-phase-commit transactional replication on a SAN-based volume that is also accessible through an iSCSI front-end server, be sure you can manage this setup.
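
To make the data loss and volume considerations above concrete, here is a rough sizing sketch in Python. Everything in it is an assumption for illustration: the names VmFootprint, backups_per_day, and storage_needed_gb are hypothetical, the sizes are made up, and the math assumes the worst case in which every retained backup is a full copy with no compression or deduplication.

```python
from dataclasses import dataclass


@dataclass
class VmFootprint:
    """Per-VM file sizes in gigabytes (illustrative values only)."""
    config_gb: float          # VM configuration files
    virtual_disks_gb: float   # virtual hard disks
    saved_state_gb: float     # saved-state files

    def total_gb(self) -> float:
        return self.config_gb + self.virtual_disks_gb + self.saved_state_gb


def backups_per_day(acceptable_loss_hours: float) -> float:
    """Back up at least once per acceptable data loss window."""
    if acceptable_loss_hours <= 0:
        raise ValueError("acceptable_loss_hours must be positive")
    return 24 / acceptable_loss_hours


def storage_needed_gb(vms, host_config_gb, copies_retained):
    """Worst-case estimate: every retained copy is a full copy.

    host_config_gb stands in for host server and virtual network
    configuration data, which are stored once per host, not per VM.
    """
    per_copy = host_config_gb + sum(vm.total_gb() for vm in vms)
    return per_copy * copies_retained
```

With two hypothetical VMs of 44.5 GB and 88.5 GB plus 1 GB of host configuration, one full copy is 134 GB, so retaining seven copies calls for 938 GB. An acceptable data loss of six hours works out to at least four backups per day.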

Most organizations will likely use a variety of backup approaches to meet their overall needs. Ideally, you’ll be able to combine this information to define different “levels” of recovery requirements.

Performing Guest OS Backups
The first option for performing backups of VMs is to treat them like physical machines. Since VMs typically contain full operating systems, you can use backup agents or scripts to ensure that important information is regularly copied to another location. The primary benefit of the guest-level backup approach is that you can store only important information. For example, in the case of a web server, you might need to store only a small set of content and configuration data since the VM can be easily rebuilt.
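
As a minimal sketch of the script-based approach, the Python function below archives a short list of important paths from inside the guest OS. The function name backup_paths and the example paths are hypothetical, and a real script would also need scheduling, retention, and a copy to storage outside the VM.

```python
import tarfile
from pathlib import Path


def backup_paths(paths, archive_path):
    """Write the given files and directories into one gzip-compressed tar."""
    archive_path = Path(archive_path)
    archive_path.parent.mkdir(parents=True, exist_ok=True)
    with tarfile.open(str(archive_path), "w:gz") as archive:
        for path in map(Path, paths):
            # Store each entry under its final path component only.
            # Two inputs with the same name would collide in the archive;
            # a fuller script should give them distinct archive names.
            archive.add(str(path), arcname=path.name)
    return archive_path
```

For the web server example, a call might look like backup_paths(["/var/www/site", "/etc/site.conf"], "/mnt/backups/web.tar.gz"), where both the source paths and the destination are placeholders for your own.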

There are drawbacks and limitations of guest OS backups, however. They include:
  • Guest OS support: If you’re using an enterprise backup solution, you need to ensure that all guest OSes are supported. This might not be a problem for standard configurations, but good luck with finding a backup agent for OS/2 or MS-DOS!
  • Complicated recovery process: If you’re choosing to back up less than the entire guest OS, the recovery process often requires multiple steps. If you don’t have a VM with the right configuration available for use, you’ll need to install the guest OS, reconfigure applications, and then perform the restore. Furthermore, the process can be wildly different based on the guest OS you’re running.
  • Recovery time: The process of recovering from a failure can be time-consuming if you need to recreate configuration settings.

Despite the drawbacks, guest OS backups can be easy and cost-effective to deploy and manage.

Performing Host-Based Backups
One of the primary strengths of virtual machines is that they are self-contained units that are stored in a set of configuration and data files. For the purposes of backup, the most important information is stored within virtual disk files, which reside on the host file system. The technical challenge is that these files tend to be exclusively locked for read and write access while a guest OS is running. You can perform “cold backups” by pausing or stopping the VM, copying the necessary files, and then restarting it. The problem, however, is that you’ll incur some downtime. Many first- and third-party vendors provide methods for performing “hot backups”, which don’t require downtime and capture the contents of the VM in a state that is consistent as of a specific point in time.
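
The cold backup sequence can be sketched in a few lines of Python. This is only an outline under stated assumptions: stop_vm and start_vm are hypothetical callables that you would supply to wrap whatever your virtualization platform offers for stopping and starting a VM, and vm_dir is assumed to hold all of that VM’s files.

```python
import shutil
from pathlib import Path


def cold_backup(vm_dir, backup_dir, stop_vm, start_vm):
    """Stop the VM, copy its files, then restart it.

    stop_vm and start_vm are caller-supplied (hypothetical) callables;
    this sketch does not talk to any real virtualization API.
    """
    vm_dir, backup_dir = Path(vm_dir), Path(backup_dir)
    stop_vm()  # downtime begins: file locks are released
    try:
        # copytree creates backup_dir and fails if it already exists,
        # so use a fresh destination for each backup run.
        shutil.copytree(str(vm_dir), str(backup_dir))
    finally:
        start_vm()  # restart the VM even if the copy failed
    return backup_dir
```

The try/finally block is the design choice worth noting: the downtime lasts exactly as long as the copy, and a failed copy should never leave the VM switched off.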

The primary advantage of performing host-based backups is that you can use a consistent method of backing up all of your VMs, regardless of their guest OS. This pays off during the recovery process: Generally, all that’s required is to move or copy the relevant files to another host server and to attach and start the VM. When using virtualization-aware backup products, you don’t have to worry about adding new VMs to the backup schedule. You can usually configure these products to automatically back up all VMs on every host server in the environment.

The downsides? Backing up those entire guest OSes can chew through disk space faster than Jabba the Hutt at an all-you-can-eat buffet! This will often limit the number and frequency of backup operations. Also, you’ll need significant bandwidth to copy the necessary files across a LAN or a SAN.

Implementing Disaster Recovery (DR)
Whether or not your organization is focused on fighting against the scrappy and resourceful Rebellion, you should make sure that your virtual infrastructure is able to survive various types of failures or attacks. I’ll start with the most important point: DR is not simply a point-and-click technical solution. It’s one that involves the entire organization, including people, processes, and IT management expertise.

The technical portion of DR involves maintaining a secondary site or collection of servers that are ready to take over operations in the case of a Rebel attack (or similar issue). The challenge, however, is keeping multiple copies of VMs in sync. File system replication products are a good option. They automatically transmit changes between the “live” VM and a secondary copy stored on another server. Of course, you can also use application-level features such as database replication, web server farm configurations, and other high-availability options to keep the VMs in sync. The primary constraint you’ll run into is limited bandwidth. Sending gigabytes of data each day between Alderaan and Coruscant can be a costly proposition.
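
To show why sending only the changes matters for bandwidth, here is a toy Python sketch of block-level change detection. It is not how any particular replication product works; the function names and the 4 KB block size are assumptions, it operates on in-memory bytes instead of real virtual disk files, and it does not handle a file that shrinks.

```python
import hashlib

BLOCK_SIZE = 4096  # illustrative block size, not a product default


def block_hashes(data, block_size=BLOCK_SIZE):
    """Return one SHA-256 digest per fixed-size block of data."""
    return [hashlib.sha256(data[i:i + block_size]).digest()
            for i in range(0, len(data), block_size)]


def changed_blocks(replica_hashes, live_data, block_size=BLOCK_SIZE):
    """Return {block_index: block_bytes} for blocks the replica lacks."""
    changes = {}
    for index, digest in enumerate(block_hashes(live_data, block_size)):
        if index >= len(replica_hashes) or replica_hashes[index] != digest:
            start = index * block_size
            changes[index] = bytes(live_data[start:start + block_size])
    return changes


def apply_changes(replica, changes, block_size=BLOCK_SIZE):
    """Patch a bytearray replica in place, in ascending block order."""
    for index in sorted(changes):
        start = index * block_size
        replica[start:start + len(changes[index])] = changes[index]
```

In this sketch, a single changed byte in a large file means one block crosses the wire instead of the whole file, which is the difference between a modest daily transfer and shipping every virtual disk in full.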

Summary
In this article, I presented three major approaches to protecting your mission-critical VMs: guest OS backups, host-level backups, and file system replication. You can easily find products from a growing list of first- and third-party virtualization vendors to help you meet these goals. And don’t overlook guest OS and application-level availability and backup features when determining the best way to protect your VMs.

If you’re responsible for managing a data center, be sure to use your Force powers of persuasion to stress the importance of backups and disaster recovery for your VMs (remember: these powers work only on weak minds, so they’re best suited for communicating with middle management and executives). And one final word of advice: If your organization uses a shield generator, consider locating it on a planet that doesn’t have a scrappy group of furball-like defenders who are sympathetic to the Rebellion (you read it here first!).

Biography


Anil Desai is an independent consultant based in Austin, TX. He specializes in evaluating, implementing, and managing solutions based on Microsoft technologies. He has worked extensively with Microsoft's Server products and the .NET development platform and has managed environments that support thousands of virtual machines. Anil is an MCITP, MCSE, MCSD, MCDBA, and a Microsoft MVP (Windows Server – Management Infrastructure).

Anil is the author of numerous technical books focusing on the Windows Server Platform, Virtualization, Active Directory, SQL Server, and IT management. He has made dozens of conference presentations and is also a frequent contributor to online and print publications. For more information, please see http://AnilDesai.net.