Solving the Virtual Machine Backup Conundrum

Best practices for a backup-and-restore strategy in a VMware environment

Just when it seems that you've finally met and solved your data backup and recovery challenges, along comes another kink to complicate the problem.

While server virtualization provides many benefits, it may create a conundrum when determining an organization's best backup and restore strategy.

In a physical server environment, some combination of full and incremental backups of all files, volumes, and servers is most likely being performed. Perhaps you've included a product that provides deduplication and an incremental-only backup strategy that reduces the need to do weekly or monthly full-system backups. Or maybe you've decreased your reliance on tape for off-site backups and have included a data replication or snapshot technology that will help to quickly restore critical applications or systems in the event of a failure.

But now that your servers have been virtualized, there are new challenges: What should be backed up? You'll certainly want to continue to have backups of all files and application data as you've done in the physical server environment. You'll also want to be able to restore any individual file, directory, or file system from these current backups. But virtualization offers another level of consideration for backup. Since the "guest" virtual machines are presented by the "host" server as files, you can back up the entire virtual machine as just one or several individual file objects. This means that if backed up correctly, you could restore an entire virtual machine by restoring just a file or two.

Let's explore the best practices for a backup-and-restore strategy in a VMware virtual infrastructure environment. When deciding how best to protect VMware virtual machines, it's important to consider your business requirements for data recovery. It's possible a restore of an individual file, file system, or database may be needed. It's also possible a rapid restore of the entire machine is required in the event of a disaster, or you may want to move a virtual machine to a different physical machine for resource load balancing.

There are three basic approaches that can be combined to meet different backup-and-restore objectives:

  1. Install a backup agent in the virtual machine.
  2. Install a backup agent on the ESX Server service console.
  3. Configure a VMware Consolidated Backup (VCB) proxy server and install a backup agent on it to centralize the backups of many virtual machines.

Method 1: Backup Agent Installed on the Virtual Machine
To achieve file-level, directory-level, and volume-level backup and restore, you can install a backup agent in the virtual machine. This is the same as installing a traditional backup agent for a centralized network backup application on a physical machine. It provides the same capabilities (incremental backups, full backups, file-level restores, and full volume restores), but it does not provide for full virtual machine backups and restores.

In the event of a complete virtual machine recovery, the first step would be to install the virtual machine operating system and backup agent. Next, restore all of the volumes backed up for the virtual machine from a full backup. This would be a directory-by-directory and file-by-file restore. This is similar to a bare machine recovery process of a physical server and is typically more time-consuming than restoring from a full virtual machine backup as described in Method 2.

Common disadvantages to this method include possible resource degradation during concurrent backup operations of multiple virtual machines on a single host server. Licensing, maintenance, and scalability issues may also be encountered.
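
One way to limit the resource degradation described above is to cap how many virtual machine backups run at the same time on each host. The following is a minimal Python sketch under stated assumptions: the function name plan_waves, the VM and host names, and the cap of two backups per host are all hypothetical illustrations and are not part of any backup product.

```python
from collections import defaultdict
from typing import Dict, List


def plan_waves(vm_to_host: Dict[str, str], max_per_host: int) -> List[List[str]]:
    """Group VMs into backup waves so that no host runs more than
    max_per_host backups at once. All names here are illustrative."""
    if max_per_host < 1:
        raise ValueError("max_per_host must be at least 1")

    # Collect the VMs that live on each host, in a stable (sorted) order.
    by_host: Dict[str, List[str]] = defaultdict(list)
    for vm, host in sorted(vm_to_host.items()):
        by_host[host].append(vm)

    waves: List[List[str]] = []
    for vms in by_host.values():
        # Slice each host's VMs into chunks of max_per_host;
        # chunk number i is added to wave number i.
        for start in range(0, len(vms), max_per_host):
            index = start // max_per_host
            while len(waves) <= index:
                waves.append([])
            waves[index].extend(vms[start:start + max_per_host])
    return waves


if __name__ == "__main__":
    # Hypothetical inventory: VM name -> host name.
    inventory = {"web01": "esx-a", "web02": "esx-a", "db01": "esx-a", "mail01": "esx-b"}
    for number, wave in enumerate(plan_waves(inventory, max_per_host=2), start=1):
        print(f"wave {number}: {wave}")
```

Each wave could then be handed to the backup scheduler as one batch, with the next wave starting only after the previous one finishes.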

Method 2: Backup Agent Installed on ESX Server Service Console
Another method for backing up and restoring virtual machines is to install a backup agent on the ESX Server service console. The backup agent can then be configured to back up the virtual disk files and virtual machine configuration files (.vmdk and .vmx, respectively). These files contain the virtual machine operating system, memory, and all data files. The entire virtual machine can then be restored by simply restoring these few files. However, it's not possible to restore individual data files and directories from this backup. This method is typically faster than creating a virtual machine, installing the operating system, installing the backup agent, and restoring all of the data files as described in Method 1, so it's preferred for disaster recovery and virtual machine relocation.

To ensure a consistent and recoverable backup of the full virtual machine, it's recommended that you pause, suspend, or freeze the virtual machine with one of the following approaches:

  • Back up the inactive virtual machine. This is considered the best way to ensure a consistent backup; however, it requires that the virtual machine be shut down during the backup. This may require pre- and post-backup schedule scripts that execute the commands to shut down the machine, back up the .vmdk files, and then restart the machine.
  • Back up a suspended virtual machine. Again, you'd need to script pre- and post-backup schedule commands to suspend and resume the machine. This method can minimize machine downtime, but restores should be tested and verified to be sure that the backup of the .vmdk files is consistent and restorable.
  • Back up an active virtual machine. Again using scripted commands in pre- and post-backup schedule jobs, you can make a snapshot of a virtual machine and back up the .vmdk files. This approach essentially freezes the virtual machine, but allows it to continue operating with all updates being sent to a redo log. When the backup is complete, the post-schedule command script unfreezes the machine and the changes in the redo log are committed to the virtual machine. This method provides for "hot backups" of the virtual machine; however, it adds complexity to the backup procedure. Restores should also be tested and verified to ensure consistency of the backups.
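
All three approaches share the same shape: a pre-backup step, the backup itself, and a post-backup step that must run even if the backup fails, so the machine is never left shut down, suspended, or running on a redo log. The following is a minimal Python sketch of that pattern. The name quiesced and the placeholder steps are hypothetical; the real pre- and post-commands depend on your ESX Server version and backup product and are deliberately not shown.

```python
from contextlib import contextmanager
from typing import Callable, Iterator, List


@contextmanager
def quiesced(pre: Callable[[], None], post: Callable[[], None]) -> Iterator[None]:
    """Run the pre-backup step, yield for the backup, then always run the post step."""
    pre()
    try:
        yield
    finally:
        # Runs even if the backup raises, so the VM is not left stopped,
        # suspended, or writing to a redo log.
        post()


if __name__ == "__main__":
    log: List[str] = []
    # Placeholder callables: in practice these would invoke whatever commands
    # your environment uses to snapshot the VM and commit the redo log.
    with quiesced(lambda: log.append("create snapshot"),
                  lambda: log.append("commit redo log")):
        log.append("back up .vmdk and .vmx files")
    print(log)
```

If the pre-backup step itself fails, the post step is skipped, since there is nothing to undo.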

Common disadvantages to all of the approaches described above include possible resource contention and degradation during concurrent backup operations of multiple virtual machines on the same ESX server host. In addition, there are maintenance and complexity issues involved in writing, scheduling, and maintaining the scripts required, as well as the inability to restore individual files and directories from the backed-up .vmdk files.

Method 3: Using VMware Consolidated Backup
VMware Consolidated Backup provides a centralized backup facility for virtual machines. It's not a complete enterprise backup solution and is intended only as an enabling technology to be used in conjunction with enterprise backup software.

VCB is essentially a set of scripts and drivers that enables third-party backup software to protect virtual machines through a centralized backup "proxy" server that runs on a Windows server. Doing backups "by proxy" removes the backup workload from VMware ESX servers and virtual machines.

VMware supplies different sets of integration module scripts for different third-party backup applications. You execute these scripts as pre- and post-process commands for the backup agent on the proxy server. During the pre-processing, the script calls the VCB proxy utilities either to create a snapshot of the virtual machine and mount it to the proxy server (for file-level backups) or export the virtual machine to a set of files on the proxy server (for full machine backups).

For file-level backups, the virtual machine continues to operate without interruption during the snapshot process. Changes to the virtual machine while it's in snapshot mode are written to a redo log that's stored in the virtual machine. These changes are applied to the machine when the snapshot is dismounted.

For full machine backups, the virtual machine is suspended during the export. When the export is complete, the machine resumes and the backup agent can then back up the exported files.

After the pre-processing step is complete, users must run the backup agent on the proxy server to perform the backup of a particular drive or drives on the proxy server where the snapshot or exported machine files reside. When the backup is complete, users must execute the post-processing scripts to dismount and delete the snapshot or virtual machine files.
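
The ordering described above can be written down as a per-machine checklist. The following Python sketch is only an illustration of the sequence. The function name vcb_steps and the mode labels "file" and "full" are hypothetical, and the steps are plain-text descriptions, not actual VCB commands.

```python
from typing import List


def vcb_steps(vm: str, mode: str) -> List[str]:
    """Return the ordered steps for backing up one VM through the proxy server.

    mode is "file" for file-level backups or "full" for full machine backups.
    The strings are descriptions only; no real utility names are implied.
    """
    if mode == "file":
        pre = f"snapshot {vm} and mount it on the proxy server"
        post = f"dismount and delete the snapshot of {vm}"
    elif mode == "full":
        pre = f"export {vm} to a set of files on the proxy server"
        post = f"delete the exported files of {vm}"
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    backup = f"run the backup agent against the proxy drive holding {vm}"
    return [pre, backup, post]


if __name__ == "__main__":
    # Hypothetical VM names, one per backup mode.
    for vm, mode in [("web01", "file"), ("db01", "full")]:
        for step_number, step in enumerate(vcb_steps(vm, mode), start=1):
            print(f"{vm} step {step_number}: {step}")
```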

The advantages of using VCB to consolidate backups of virtual machines are that it reduces resource utilization on production virtual machines and hosts and allows virtual machines to continue to operate with little to no impact during backup operations. Additionally, using the central VCB proxy server means it's not necessary to install individual backup clients on each virtual machine.

However, VCB can be complex to implement using the supplied integration module scripts. In many cases, the scripts will need modification. It may also be difficult to schedule the scripts for individual pre- and post-processing of each virtual machine, especially for large numbers of virtual machines. Finally, it may be difficult to troubleshoot anomalies when using the scripts.

Recognizing these difficulties, some backup application suppliers have developed their own integration utilities or agents for VCB. These products simplify the implementation of VCB by providing easy-to-use scheduling and reporting functions, while eliminating the need to use the VMware integration modules.

There are many considerations for backing up virtual environments. Just as when solving the backup needs of your physical environment, you should carefully define your business requirements for backing up and restoring data in the virtual infrastructure. Once you understand recovery time and recovery point objectives, you can begin to implement and test your strategy.
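
As a rough aid to that planning, the capabilities of the three methods as described in this article can be put in a small table and searched for the smallest combination that covers a set of requirements. The following Python sketch is a simplification under stated assumptions: the capability labels and the function name smallest_combinations are hypothetical, and the table ignores cost, licensing, and complexity.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Set, Tuple

# Capabilities as summarized in this article; labels are illustrative.
CAPABILITIES: Dict[str, FrozenSet[str]] = {
    "agent in virtual machine": frozenset({"file_restore"}),
    "agent on service console": frozenset({"full_vm_restore"}),
    "VCB proxy": frozenset({"file_restore", "full_vm_restore", "offload_host"}),
}


def smallest_combinations(needs: Set[str]) -> List[Tuple[str, ...]]:
    """Return every smallest set of methods that together covers all needs."""
    unknown = needs - set().union(*CAPABILITIES.values())
    if unknown:
        raise ValueError(f"unknown requirement(s): {sorted(unknown)}")
    # Try single methods first, then pairs, then all three.
    for size in range(1, len(CAPABILITIES) + 1):
        hits = [
            combo
            for combo in combinations(CAPABILITIES, size)
            if needs <= frozenset().union(*(CAPABILITIES[m] for m in combo))
        ]
        if hits:
            return hits
    return []


if __name__ == "__main__":
    print(smallest_combinations({"file_restore"}))
    print(smallest_combinations({"file_restore", "full_vm_restore"}))
```

Because the search returns only the smallest combinations, pairing Method 1 with Method 2 does not appear when VCB alone covers the same needs, even though that pairing is also a valid strategy.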

More Stories By Laura Buckley

Laura Buckley is president of Colorado Springs-based STORServer, Inc. STORServer was founded in 2000 to provide simple, adaptable, and scalable solutions for data protection. A privately held corporation with locations in the United States and Europe, STORServer is the manufacturer of STORServer Appliances. Each appliance is an enterprise-wide comprehensive solution that installs quickly and takes just minutes a day to manage.

Most Recent Comments
osm3um 06/18/09 11:02:00 PM EDT

ESXPress by PHD is a host-based solution that performs image-based backups with dedup (across all backed-up VMs) and allows VM restores as well as file restores.

It is an excellent solution.

Bob (not associated with ESXPress, just a happy customer).