Efficient, Portable and Extensible Backup of VMs
Just like the data they process, virtual machines should be backed up and stored regularly to avoid having to redo all of your work, writes Kamalkumar Mistry of Infosys Labs.
In the knowledge industry, where data and information are the most important organizational assets, their protection and integrity are critical. With the ever-increasing use of cloud and virtual infrastructure for processing and storing data, the backup of these virtual processing nodes, also known as virtual machines (VMs), is equally important.
This article describes the need for backing up virtual machines and explores various ways of making a VM backup. It also defines the concept of a "service instance" as a set of VMs and proposes two algorithms: one for backing up a service instance and the other for restoring it to the target virtualized environment.
The most fundamental and reliable answer to the problem of data loss is to take regular backups of all the data and keep them in a safe place with restricted access. The same is true for VMs.
Backup of VM in a virtualized environment
There are several reasons for backing up the whole VM rather than only the data it processes:
- A VM can be cloned or replicated from another virtual machine, which is not the case with physical machines. So if you have one virtual machine with all the necessary software configured in it, also known as the master copy VM, you can create as many virtual machine instances from it as you need. Keeping a backup of the master copy VM will be of great help in case of system failure.
- When a physical machine crashes, the data on its secondary storage devices can be recovered in full as long as those devices are intact. That is not the case for virtual machines: if the VM image file is accidentally deleted or corrupted, all the data in the VM image is lost.
- Taking snapshots of the VM at regular intervals allows you to go back to a particular state of the VM. This technique is very helpful in testing and research environments.
Various ways of taking backups of VMs
VMs can be backed up in various ways and in different formats. Which one to choose depends on the purpose of the backup. What follows are some of the popular ways of backing up virtual machines:
Taking a snapshot of the virtual machine - You can take a snapshot of a VM while it is running. In principle, snapshots are not backups, but they preserve the state and data of a virtual machine at a specific point in time. The data includes all of the files that make up the VM, such as disks, memory, and other devices.
Snapshots are not meant for permanent or long-term backup; they are useful only during the lifetime of the VM, that is, from the time the VM is created until it is deleted. A snapshot is useful when the running VM gets corrupted or when you want to go back to a previously known state of the VM.
Copying the whole VM image and related configuration files - This procedure involves copying the VM image and its configuration files from the virtualized environment to another storage medium, generally outside the virtualized environment. Later on, the VM image and configuration files can be used to restore the VM after a crash or to revert to a previous state of the VM.
The limitation is that the copied VM image and configuration files depend on the virtualization technology, which means the backed-up VM image can only be re-provisioned on the same hypervisor. The version of the hypervisor platform may also cause problems later when provisioning backed-up VM images. This can happen if the format of the configuration file changes or if the hypervisor changes the way it creates VM images.
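As a rough illustration of the copy-based approach, the following Python sketch copies every file in a VM's directory to a backup location and records a SHA-256 checksum for each, so that a later restore can detect a corrupted image. It assumes the VM is shut down while it is copied and that its image and configuration files sit together in one directory. The function names, the directory layout and the manifest.json file are hypothetical choices for this example and do not belong to any particular hypervisor.

```python
import hashlib
import json
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large disk images do not fill memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_vm_files(vm_dir: Path, backup_dir: Path) -> dict:
    """Copy each regular file in vm_dir to backup_dir and record checksums.

    Assumption: the VM is powered off, so the image does not change
    while it is being copied.
    """
    backup_dir.mkdir(parents=True, exist_ok=True)
    manifest = {}
    for source in sorted(vm_dir.iterdir()):
        if not source.is_file():
            continue  # this sketch ignores subdirectories
        target = backup_dir / source.name
        shutil.copy2(source, target)  # copy2 also preserves timestamps
        checksum = sha256_of(target)
        if checksum != sha256_of(source):
            raise IOError(f"checksum mismatch after copying {source.name}")
        manifest[source.name] = checksum
    # Hypothetical manifest file, written next to the copied files.
    (backup_dir / "manifest.json").write_text(json.dumps(manifest, indent=2))
    return manifest
```

Calling backup_vm_files(Path("/vms/master-copy"), Path("/mnt/backup/master-copy")), with both paths being placeholders, would copy the files and return the checksum table. The sketch does nothing about the hypervisor dependence described above: the copied files remain in whatever format the hypervisor uses.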
Storing the virtual machine in Open Virtualization Format (OVF) - The third option is to convert the VM to OVF and store it as a backup in offline storage outside the virtualized environment. Later, the same OVF package can be redeployed to restore the VM. An OVF package consists of several files placed in one directory. Generally, these are the VM image files together with the VM configuration information in XML format.
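To make the package structure concrete, here is a minimal Python sketch that reads an OVF descriptor (the XML file holding the configuration) and lists the disk image files it refers to, which is the kind of check a restore step might run before redeploying. It assumes the OVF 1.x envelope namespace. The sample descriptor is heavily simplified and is not a complete, valid OVF document, and the VM and file names in it are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace of the OVF 1.x envelope schema.
OVF_NS = "http://schemas.dmtf.org/ovf/envelope/1"

# Simplified, hypothetical descriptor: a real one also carries disk,
# network and virtual hardware sections.
SAMPLE_DESCRIPTOR = """<Envelope xmlns="http://schemas.dmtf.org/ovf/envelope/1"
          xmlns:ovf="http://schemas.dmtf.org/ovf/envelope/1">
  <References>
    <File ovf:id="file1" ovf:href="web-server-disk1.vmdk"/>
  </References>
  <VirtualSystem ovf:id="web-server">
    <Info>A hypothetical single-disk VM</Info>
  </VirtualSystem>
</Envelope>
"""


def referenced_files(descriptor_xml: str) -> list:
    """Return the href of every File element in the descriptor."""
    root = ET.fromstring(descriptor_xml)
    href_attr = f"{{{OVF_NS}}}href"
    return [
        element.get(href_attr)
        for element in root.iter(f"{{{OVF_NS}}}File")
        if element.get(href_attr)
    ]


def virtual_system_ids(descriptor_xml: str) -> list:
    """Return the ovf:id of every VirtualSystem element."""
    root = ET.fromstring(descriptor_xml)
    id_attr = f"{{{OVF_NS}}}id"
    return [
        element.get(id_attr)
        for element in root.iter(f"{{{OVF_NS}}}VirtualSystem")
    ]


if __name__ == "__main__":
    print(referenced_files(SAMPLE_DESCRIPTOR))
    print(virtual_system_ids(SAMPLE_DESCRIPTOR))
```

Comparing the result of referenced_files with the files actually present in the package directory is one simple way to confirm that a stored package is complete before trying to redeploy it.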