Efficient, Portable and Extensible Backup of VMs

In the knowledge industry, where data and information are the most important organizational assets, their protection and integrity are critical. With the ever-increasing use of cloud and virtual infrastructure for processing and storing data, backing up the virtual processing nodes, known as virtual machines (VMs), is equally important.

This article describes the need for backing up virtual machines and explores various ways of making a VM backup. It also defines the concept of a “service instance” as a set of VMs and proposes two algorithms, one for backing up a service instance and the other for restoring it to the target virtualized environment.

The most fundamental and reliable protection against data loss is to take regular backups of all data and keep them in a safe place with restricted access. The same is true for VMs.

Backup of a VM in a virtualized environment

Several reasons make it necessary to back up the whole VM rather than only the data it processes:

  1. A VM can be cloned or replicated from another virtual machine, which is not possible with physical machines. So if you have one virtual machine with all the necessary software configured in it, known as the master copy VM, you can create as many VM instances from it as you need. Keeping a backup of the master copy VM is of great help in case of system failure.
  2. When a physical machine crashes, the data can still be recovered as long as its secondary storage devices are intact. That is not the case for virtual machines: if the VM image file is accidentally deleted or corrupted, all the data in the VM image is lost.
  3. Taking snapshots of the VM at regular intervals allows you to go back to a particular state of the VM. This technique is very helpful in testing and research environments.

Various ways of taking backups of VMs

VMs can be backed up in several ways and in different formats. The choice depends on the purpose of the backup. The following are some of the popular ways of backing up virtual machines:

Taking the snapshot of the virtual machine – You can take a snapshot of a VM while it is running. Strictly speaking, snapshots are not backups, but they preserve the state and data of a virtual machine at a specific point in time. The data includes all of the files that make up the VM, such as disks, memory, and other devices.

Snapshots are not meant for permanent or long-term backup; they are useful only during the lifetime of the VM, that is, from the time the VM is created until it is deleted. A snapshot is useful when the running VM gets corrupted or when you want to go back to a previous known state of the VM.

Copying the whole VM image and related configuration files – This procedure involves copying the VM image and its configuration files from the virtualized environment to another storage medium, generally outside the virtualized environment. Later, the VM image and configuration files can be used to restore the VM after a crash or to revert to a previous state of the VM.

The limitation is that the copied VM image and configuration files depend on the virtualization technology, which means the backed-up VM image can only be re-provisioned on the same hypervisor. The version of the hypervisor platform may also cause problems later when provisioning backed-up VM images, for example if the format of the configuration file or the way the hypervisor creates VM images has changed.

Storing the virtual machine in open virtualization format (OVF) – The third option is to convert the VM into OVF and store it as a backup in offline storage outside the virtualized environment. Later, the same OVF can be redeployed to restore the VM. An OVF package consists of several files placed in one directory: generally, the VM image files plus the VM configuration information in XML format.
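Because the configuration part of an OVF package is plain XML, a backup tool can inspect it without any hypervisor. The following is a minimal sketch, assuming a heavily trimmed, hand-written descriptor; the VM id and disk file name are invented for illustration, and a real descriptor contains many more sections (disks, networks, virtual hardware).

```python
import xml.etree.ElementTree as ET

# Namespace of the DMTF OVF 1.x envelope schema.
OVF_NS = "http://schemas.dmtf.org/ovf/envelope/1"

# Hypothetical, heavily trimmed descriptor used only for illustration.
SAMPLE_DESCRIPTOR = f"""<Envelope xmlns="{OVF_NS}" xmlns:ovf="{OVF_NS}">
  <References>
    <File ovf:id="file1" ovf:href="app-server-disk1.vmdk"/>
  </References>
  <VirtualSystem ovf:id="app-server">
    <Info>A hypothetical application server VM</Info>
  </VirtualSystem>
</Envelope>
"""


def summarize_descriptor(xml_text):
    """Return the virtual system ids and the referenced files of a descriptor."""
    root = ET.fromstring(xml_text)
    # ElementTree spells qualified names as "{namespace}local-name".
    files = [f.get(f"{{{OVF_NS}}}href") for f in root.iter(f"{{{OVF_NS}}}File")]
    systems = [s.get(f"{{{OVF_NS}}}id")
               for s in root.iter(f"{{{OVF_NS}}}VirtualSystem")]
    return {"virtual_systems": systems, "files": files}


summary = summarize_descriptor(SAMPLE_DESCRIPTOR)
```

A check like this can be used, for example, to confirm that every file referenced by a descriptor is actually present in the backup directory before the backup is recorded as complete.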

Service instance

We define a service instance as a set of VMs with a specific set of software products and applications already installed. A running service instance provides a specific service to a specific set of users. The number of VMs in a service instance varies at runtime, based on service level agreements (SLAs) and the number of user requests the service is handling at a given time.

An example of a service instance is a set of VMs hosting an application server in a virtualized environment. As part of this service instance, one or more application engines, a load balancer, a request scheduler and other supporting software may be running in one or more virtual machines at a time. The information about the service instance, such as its name, ownership information, target environment information, temporal information and information about all the VMs that belong to it, is stored in a database, in a metadata file (preferably an XML document), or in both.
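To make the metadata idea concrete, here is one possible shape for a service instance record and its XML form. This is only a sketch: the class names, field names and XML element and attribute names are assumptions chosen for illustration, not a format defined by the article.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class VMRecord:
    name: str
    is_template: bool = False
    power_state: str = "Powered Off"  # or "Powered On" / "Suspended"


@dataclass
class ServiceInstance:
    name: str
    owner: str
    target_environment: str
    created: str  # temporal information, kept as an ISO 8601 string
    vms: list = field(default_factory=list)

    def to_xml(self):
        """Serialize the record to a small XML metadata document."""
        root = ET.Element("ServiceInstance", {
            "name": self.name,
            "owner": self.owner,
            "targetEnvironment": self.target_environment,
            "created": self.created,
        })
        for vm in self.vms:
            ET.SubElement(root, "VM", {
                "name": vm.name,
                "isTemplate": "true" if vm.is_template else "false",
                "powerState": vm.power_state,
            })
        return ET.tostring(root, encoding="unicode")

    @classmethod
    def from_xml(cls, text):
        """Rebuild the record from the XML produced by to_xml()."""
        root = ET.fromstring(text)
        vms = [VMRecord(e.get("name"), e.get("isTemplate") == "true",
                        e.get("powerState"))
               for e in root.findall("VM")]
        return cls(root.get("name"), root.get("owner"),
                   root.get("targetEnvironment"), root.get("created"), vms)
```

Keeping the template flag and power state per VM matters because both algorithms that follow branch on exactly these two properties.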

The following sections describe the procedures for backing up a service instance and for restoring a backed-up service instance. Each procedure is presented as an algorithm.

Taking a backup of a service instance

Once a service instance is created and configured, it is advisable to back it up and store the backup as a master copy outside the virtualized environment. Taking periodic backups is equally important. The master copy and the periodic backups make it possible to recover quickly from situations such as:

  • A datacenter crash due to natural calamities or human activities.
  • A move to a different virtual infrastructure provider.
  • A crash of one or more VMs, or corruption of a VM file system, in the service instance.

It is very important that the backup is not tied to any particular hypervisor or processor architecture. Backups should also be taken periodically without disturbing the running service instance. These requirements are best served by taking backups in the form of OVFs, since OVF is independent of any hypervisor or processor architecture.

We propose the following algorithm for an efficient backup mechanism in the VMware environment. The same algorithm can be applied to other virtualization platforms, provided they offer the features needed for OVF backup:

Algorithm-1: Backup Service Instance

  1. For each VM in the service instance, repeat steps 2 and 3.
  2. If (VM is not a template and VM power state is “Powered On” or “Suspended”)
       1. Convert the running VM into a template (since it is not possible to export the OVF of a running VM in a VMware environment).
       2. Export the OVF of the VM template to the target location.
       3. Delete the template after the OVF has been exported successfully.
     else if (VM is a template or VM power state is “Powered Off”)
       1. Export the OVF of the VM or VM template to the target location.
     end if
  3. Update the service instance metadata with the information for the backed-up VM, such as user name, VM name, target location, time, backup version number and other custom information.
  4. Export the service instance metadata to the target backup location. It will be used when restoring the service instance.
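The control flow of Algorithm-1 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a VMware implementation: FakeHypervisor and all of its methods are hypothetical stand-ins for whatever API the platform provides, and the sketch assumes that creating the temporary template leaves the running VM untouched.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

RUNNING_STATES = ("Powered On", "Suspended")


@dataclass
class VM:
    name: str
    is_template: bool = False
    power_state: str = "Powered Off"


class FakeHypervisor:
    """In-memory stand-in for a real virtualization API (hypothetical)."""

    def __init__(self):
        self.templates = set()      # temporary templates that still exist
        self.exported = []          # OVF paths written, in order
        self.saved_metadata = None  # (target, metadata) of the last export

    def convert_to_template(self, vm):
        # Assumption: yields a separate temporary template, so deleting it
        # later does not remove the running VM.
        template = VM(f"{vm.name}-tmp-template", is_template=True)
        self.templates.add(template.name)
        return template

    def export_ovf(self, source, target, ovf_name):
        path = f"{target}/{ovf_name}.ovf"
        self.exported.append(path)
        return path

    def delete_template(self, template):
        self.templates.discard(template.name)

    def export_metadata(self, metadata, target):
        self.saved_metadata = (target, metadata)


def backup_service_instance(hv, vms, metadata, target, user, version):
    for vm in vms:
        if not vm.is_template and vm.power_state in RUNNING_STATES:
            template = hv.convert_to_template(vm)
            ovf_path = hv.export_ovf(template, target, vm.name)
            # Reached only if the export above did not raise.
            hv.delete_template(template)
        else:
            # Templates and powered-off VMs are exported directly.
            ovf_path = hv.export_ovf(vm, target, vm.name)
        # Record the backup of this VM in the service instance metadata.
        metadata.setdefault("backups", []).append({
            "user": user, "vm": vm.name, "ovf": ovf_path, "version": version,
            "time": datetime.now(timezone.utc).isoformat(),
        })
    # Finally, store the metadata next to the exported OVFs.
    hv.export_metadata(metadata, target)
    return metadata
```

Naming each OVF after the original VM rather than the temporary template is a design choice of this sketch; it lets a later restore look up the package by VM name.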

Restoring a backed-up service instance

The following algorithm describes the steps for restoring a backed-up service instance to the target virtualization environment.

Algorithm-2: Restore Service Instance

  1. Let the user specify the version number of the service instance to be restored from the available backup versions.
  2. From the service instance metadata, find the previous backup information and check that OVFs exist for all the VMs in the service instance.
  3. For each VM in the selected service instance, repeat step 4.
  4. If (VM is not a template and VM power state is “Powered On” or “Suspended”)
       1. Store the VM power state.
       2. Power off the VM.
       3. Delete the powered-off VM from the virtualized environment.
       4. Import and deploy the OVF corresponding to the deleted VM from the backup location.
       5. Either power on or suspend the VM, based on the power state stored in sub-step 1 of this branch.
     else if (VM is a template or VM power state is “Powered Off”)
       1. Delete the VM from the virtualized environment.
       2. Import and deploy the OVF corresponding to the deleted VM from the backup location.
       3. If the VM was a template, convert the deployed VM into a template.
     end if
  5. Update the service instance metadata with the restore information.
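The restore procedure can be sketched the same way. As before, this is a hedged illustration: FakeEnvironment, its methods and the dictionary layout of the metadata are hypothetical, and the sketch assumes that every VM named in the backup still exists in the environment when the restore starts.

```python
from dataclasses import dataclass

RUNNING_STATES = ("Powered On", "Suspended")


@dataclass
class VM:
    name: str
    is_template: bool = False
    power_state: str = "Powered Off"


class FakeEnvironment:
    """In-memory stand-in for the environment and backup store (hypothetical)."""

    def __init__(self, vms, ovfs):
        self.vms = {vm.name: vm for vm in vms}
        self.ovfs = set(ovfs)  # OVF paths present at the backup location

    def ovf_exists(self, path):
        return path in self.ovfs

    def power_off(self, vm):
        vm.power_state = "Powered Off"

    def delete(self, vm):
        del self.vms[vm.name]

    def deploy_ovf(self, path, name):
        self.vms[name] = VM(name)  # freshly deployed VMs start powered off
        return self.vms[name]


def restore_service_instance(env, metadata, version):
    # Select the requested version and verify every OVF before deleting anything.
    entries = [b for b in metadata["backups"] if b["version"] == version]
    missing = [b["ovf"] for b in entries if not env.ovf_exists(b["ovf"])]
    if not entries or missing:
        raise FileNotFoundError(f"cannot restore version {version}: {missing}")
    for entry in entries:
        vm = env.vms[entry["vm"]]
        if not vm.is_template and vm.power_state in RUNNING_STATES:
            saved_state = vm.power_state       # remember the power state
            env.power_off(vm)
            env.delete(vm)
            restored = env.deploy_ovf(entry["ovf"], vm.name)
            restored.power_state = saved_state  # power on or suspend again
        else:
            was_template = vm.is_template
            env.delete(vm)
            restored = env.deploy_ovf(entry["ovf"], vm.name)
            if was_template:
                restored.is_template = True    # convert back into a template
    # Record the restore in the service instance metadata.
    metadata.setdefault("restores", []).append(
        {"version": version, "vms": [e["vm"] for e in entries]})
    return metadata
```

Checking all OVFs up front, before the first VM is deleted, keeps a half-complete backup from leaving the service instance partly removed.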

Kamalkumar Mistry is a technology analyst at Infosys, Ltd. India. Kamal holds a master’s degree in Information and Communication Technology (ICT) from DA-IICT, Gandhinagar, India. He has around five years of industry experience designing and developing software in areas such as aerospace network system simulation, cloud computing, big data analytics and applications based on various virtualization technologies. His research interests include designing routing protocols and algorithms for mobile ad hoc networks and delay-tolerant networks, network simulation and various virtualization technologies.
