Making the Case for Cloud Storage
The cloud is rapidly emerging as an ultra-low-cost, flexible and highly scalable storage resource, and it's this type of usage, rather than running applications in the cloud, that's catching the interest of many enterprises.
The attraction of cloud storage is not hard to understand because it can be used in a variety of ways. The most popular ones are:
Overflow capacity - Cloud storage can provide short-term storage capacity to cope with occasional spikes in demand for storage, or to provide storage as an interim measure before new on-premises storage devices come online. For performance, network latency and reliability reasons, it is rare for this type of overflow capacity to be used to store current data being processed by applications.
Archive storage - Cloud storage provides a lower-tier storage option for older data that organizations may need to retain for many years for compliance or future analytical purposes, but that may rarely or never be accessed. Moving this type of data to cloud storage frees up more expensive tier-one storage resources for everyday use by on-premises applications.
Backup and disaster recovery - Cloud storage provides an ideal off-site location to store backups, eliminating the need for tape handling. Disaster recovery (DR) can be achieved by sending backup data back to the original site, to an alternative DR site, or even by spinning up virtual machines at the cloud storage provider's data center and using the backed-up data in the cloud.
Of these use cases, the most popular is probably backup and disaster recovery, according to Laura Dubois, an IDC analyst. In particular, it is popular with smaller firms with less storage capacity or larger firms backing up their remote branch offices and endpoints, she said. The key attraction is that it removes the need to have backup tapes collected and stored in secure vaults offsite.
"Firms need to pay a third party to transport the tapes to the secure vault and to store them. The cost is not trivial," Dubois said. "Then there is risk of compromise to the tapes with the media changing hands. And if you need to recover, the tapes need to be found and transported to the place of recovery. Depending on where they are stored that could take hours, be done overnight or take days."
Cloud storage gateways
To use cloud-based DR, many companies are looking to cloud storage gateways: software or appliances, either physical or virtual, that translate cloud storage APIs to block-based storage protocols like iSCSI and Fibre Channel or file-based interfaces like NFS and CIFS. With a cloud storage gateway installed on premises, companies can continue to use their existing backup applications, using the gateway as a target.
Riverbed's Whitewater cloud storage gateway is designed exactly for this purpose. Targeted at medium-sized enterprises, a Whitewater appliance (either physical or virtual) talks to a data protection or backup server and accepts the backup data. The appliance then carries out deduping, compression and encryption before caching the most recent backups and replicating the data to a cloud storage provider such as Amazon. The deduping and compression lead to an average reduction in the data that needs to go to the cloud of about 25:1, so that only about four percent of the original data is sent, according to Ray Villeneuve, Riverbed's Whitewater general manager.
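To make the dedupe-and-compress idea concrete, here is a minimal Python sketch. It is not Riverbed's implementation: the fixed 4 KB chunk size, the function names and the in-memory dictionary standing in for the appliance's store are all assumptions made for illustration, and the encryption step is left out because the Python standard library does not include a symmetric cipher.

```python
import hashlib
import zlib

CHUNK_SIZE = 4096  # assumed fixed chunk size; real products may chunk differently


def dedupe_and_compress(data, store):
    """Split data into chunks and keep one compressed copy of each unique chunk.

    Returns the ordered list of chunk digests needed to rebuild the data.
    """
    recipe = []
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in store:  # only chunks not seen before consume space
            store[digest] = zlib.compress(chunk)
        recipe.append(digest)
    return recipe


def restore(recipe, store):
    """Rebuild the original data from its recipe and the chunk store."""
    return b"".join(zlib.decompress(store[digest]) for digest in recipe)


def stored_bytes(store):
    """Total size of the unique, compressed chunks."""
    return sum(len(blob) for blob in store.values())


# Hypothetical backup made of highly repetitive data.
backup = b"A" * CHUNK_SIZE * 10 + b"B" * CHUNK_SIZE * 5
store = {}
recipe = dedupe_and_compress(backup, store)
ratio = len(backup) / stored_bytes(store)
print(f"{len(recipe)} chunks referenced, {len(store)} stored, ratio {ratio:.0f}:1")
```

The ratio this toy data produces says nothing about real workloads. The 25:1 average quoted above is Riverbed's figure, and it corresponds to sending 1/25, or four percent, of the original bytes.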
If data needs to be restored, the relevant data will often be found in the Whitewater appliance's cache. If not, it can be restored from the cloud through the Whitewater appliance. For DR purposes, organizations can use a standby Whitewater appliance at an alternative site or simply download and bring one up as a virtual appliance. They can then begin to download data or mission-critical subsets of the data to "rehydrate" the replacement Whitewater appliance, which then passes the data back to the backup application.
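The restore path described here, local cache first and cloud second, can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Whitewater's design: the class name, the least-recently-used eviction policy and the dictionary standing in for the cloud provider are all hypothetical.

```python
from collections import OrderedDict


class GatewaySketch:
    """Toy model: a bounded local cache in front of a slower cloud store."""

    def __init__(self, cloud, cache_capacity):
        self.cloud = cloud                  # dict standing in for the cloud provider
        self.cache = OrderedDict()          # most recently used entries are last
        self.cache_capacity = cache_capacity
        self.cloud_reads = 0                # counts restores that needed the WAN

    def _remember(self, name, data):
        self.cache[name] = data
        self.cache.move_to_end(name)
        while len(self.cache) > self.cache_capacity:
            self.cache.popitem(last=False)  # evict the least recently used entry

    def backup(self, name, data):
        self.cloud[name] = data             # replicate to the cloud
        self._remember(name, data)          # keep a recent copy locally

    def restore(self, name):
        if name in self.cache:              # fast path: served from the local cache
            self.cache.move_to_end(name)
            return self.cache[name]
        self.cloud_reads += 1               # slow path: fetch from the cloud
        data = self.cloud[name]
        self._remember(name, data)
        return data


cloud = {}
primary = GatewaySketch(cloud, cache_capacity=2)
for day in ("mon", "tue", "wed"):
    primary.backup(day, day.encode())

primary.restore("wed")   # recent backup, found in the cache
primary.restore("mon")   # older backup, fetched from the cloud

# DR case: a replacement gateway starts with an empty cache and
# "rehydrates" from the same cloud store.
standby = GatewaySketch(cloud, cache_capacity=2)
standby.restore("tue")
```

In this model a standby gateway needs nothing from the failed site. Every restore it serves begins as a cloud read, which is why the WAN link matters in a DR scenario.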
Villeneuve sees this use case as the most natural for Whitewater appliances at the moment, but expects them to be used for archiving more in the future. "Our appliances are optimized for back up [of] a small number of large files. But we are aware of the archiving use case; a workload characterized as more random I/O. In the future, we will add features to enable that, although we have some customers using it for that purpose already."
Dubois agreed. "The majority of mid-market customers still do their archiving/long term retention of data through their back up application policies. So, yes this would be feasible. But firms need to factor in the recurring service costs of keeping data in the cloud versus the capital and op-ex costs of keeping it onsite."
Sliced and diced
Cleversafe, a Chicago-based storage company, developed another type of appliance that can be used as a cloud storage gateway. The company's appliances use storage slicing technology to divide up blocks of data, encrypt them, and then store the slices in different locations. The system uses variable rates of redundancy, so that, for example, only 10 slices out of 16 may be required to reassemble the data.
The implications of this for companies with large data archives are important. Using Cleversafe's technology, it is possible to store what are effectively multiple copies of huge data sets by distributing the slices across a cloud service provider's different data centers. Because redundancy is built into the slicing mechanism, increasing the total volume stored by a factor of 1.6 in our example (16 slices are stored, but only 10 are needed to recreate the data) results in what is, in effect, multiple copies of the data set. If a company has 50 petabytes of data to be stored, for example, the potential savings in storage volumes (and therefore costs) can be huge.
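The arithmetic behind that claim is easy to check. The short Python sketch below compares the raw capacity needed for the 10-of-16 example with the capacity needed for two full copies, using the 50-petabyte figure from the text. The function names are made up for illustration, and the calculation ignores metadata and any other overhead a real system would add.

```python
def dispersed_raw_capacity(data_pb, total_slices, required_slices):
    """Raw capacity when data is stored as total_slices slices,
    any required_slices of which can rebuild it."""
    return data_pb * total_slices / required_slices


def replicated_raw_capacity(data_pb, copies):
    """Raw capacity when the data is stored as full copies."""
    return data_pb * copies


data_pb = 50                      # petabytes, from the example in the text
total, required = 16, 10          # slicing parameters from the example

sliced = dispersed_raw_capacity(data_pb, total, required)    # 80.0 PB
two_copies = replicated_raw_capacity(data_pb, copies=2)      # 100 PB
tolerated_losses = total - required                          # 6 slices

print(f"sliced: {sliced} PB, two copies: {two_copies} PB, "
      f"saving: {two_copies - sliced} PB")
print(f"data survives the loss of up to {tolerated_losses} slices")
```

Under these assumptions the sliced layout needs 80 petabytes of raw capacity against 100 petabytes for an original plus a second copy, and it can lose any 6 of the 16 slices and still rebuild the data.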
"Our model is cost effective with storage capacities of a petabyte or above," said Russ Kennedy, Cleversafe's Strategy vice president. "Normally, companies would have to make a second copy of their data, which is very expensive. We offer high reliability with one instance of the data."
The system is also highly scalable, which appeals to cloud storage service providers. Although Cleversafe currently sells only direct to enterprises for use in private cloud environments, Kennedy said the company is in talks with cloud providers about offering its technology for cloud storage services.
For cloud storage to work, there are several considerations that need to be taken into account. One is the physical location of data stored in the cloud, because regulations in some countries or industries dictate where data must be stored. Using a cloud storage provider that replicates data for redundancy purposes on different continents might easily breach these regulations.
There's also the issue of WAN bandwidth to the cloud. This doesn't always come cheap. Riverbed recommends a 50Mbps to 100Mbps WAN connection as a minimum.
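A rough calculation shows why the link speed matters. The Python sketch below estimates how long a backup takes to cross the WAN at the two speeds Riverbed mentions, with and without the 25:1 data reduction quoted earlier. The 1 TB backup size is an illustrative assumption, not a figure from the vendors, and the estimate assumes the link is fully used with no protocol overhead, so real transfers would take longer.

```python
def transfer_hours(data_gb, link_mbps, reduction_ratio=1.0):
    """Hours needed to send data_gb gigabytes over a link_mbps link.

    reduction_ratio is the dedupe/compression ratio (25 means 25:1).
    Uses decimal units: 1 GB = 8e9 bits, 1 Mbps = 1e6 bits per second.
    """
    bits_to_send = data_gb * 8e9 / reduction_ratio
    seconds = bits_to_send / (link_mbps * 1e6)
    return seconds / 3600


backup_gb = 1000  # hypothetical 1 TB backup

for mbps in (50, 100):
    raw = transfer_hours(backup_gb, mbps)
    reduced = transfer_hours(backup_gb, mbps, reduction_ratio=25)
    print(f"{mbps} Mbps: {raw:.1f} h unreduced, {reduced:.1f} h at 25:1")
```

At 50Mbps the unreduced terabyte takes about 44 hours, which is too long for a nightly backup window, while the same backup reduced 25:1 crosses in under two hours.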
The fine print
IDC's Dubois warned that the devil can be in the detail, and recommended that special consideration be given to SLAs, cloud contract terms, pricing, certifications, independent audits, and, of course, security -- both at the physical and logical level.
As cloud storage technologies become more developed, new use cases will emerge, Dubois believes. "An example would be the opportunity for a file sharing/sync/collaboration cloud storage service for businesses. Like an enterprise version of DropBox," she said. "Something that can bridge both public and private cloud content is really the ideal."
At the moment the cloud storage gateway market is quite small, and key vendors include:
Cleversafe - software or software and appliance-based system based on storage slicing technology. Not yet offered as part of a cloud storage solution.
CTERA - "Cloud attached storage" appliances aimed at SMBs, branch offices and SOHO environments. These combine on-premises storage infrastructure with cloud storage from Swisscom, OpSource, Amazon S3, Rackspace, Dimension Data, VISI, petaera and Quadria.
Gladinet - CloudAFS software, a file-server interface to the cloud, is licensed on a monthly basis. Working with CloudAFS, the Cloud Desktop client software presents cloud storage providers including Windows Live SkyDrive, Google Docs, Amazon S3, EMC Atmos–based providers, and AT&T Synaptic Storage as local disks.
Nasuni - The Nasuni Cloud File Server virtual appliance holds data in a local cache and replicates it to public cloud storage providers including Amazon S3, AT&T Synaptic Storage, Nirvanix, PEER 1 Hosting, Rackspace Cloud, and Microsoft Windows Azure. Pricing is based on a flat monthly charge.
Panzura - Panzura's Alto Cloud Controller appliance or virtual appliance works with cloud vendors including Nirvanix, Amazon S3, Microsoft Windows Azure, EMC Atmos, and AT&T Synaptic Storage as well as CDN networks such as Limelight. The appliances support NFS, CIFS and HTTP, and can carry out file locking, versioning, snapshots, compression, deduplication and local caching.
Riverbed - Whitewater appliances and virtual appliances work with Amazon S3, AT&T Synaptic Storage, and Nirvanix, and support all major backup tool platforms, including NetBackup, Backup Exec, TSM, Quest vRanger, NetWorker, and ARCserve, requiring no change or rewrite of existing applications.
StorSimple - StorSimple's appliances are targeted at primary storage workloads with application plug-ins for SharePoint, Exchange and shared drives, using cloud storage from AT&T Synaptic Storage, Amazon S3, EMC Atmos–based partners, and Microsoft Windows Azure.
Twinstrata - Twinstrata's CloudArray appliances and virtual appliances connect to cloud storage from PEER 1 Hosting, Amazon S3, Windstream Hosted Solutions, EMC Atmos–enabled clouds, and AT&T Synaptic Storage, and appear as iSCSI volumes.
Paul Rubens has been covering IT security for over 20 years. In that time he has written for leading UK and international publications including The Economist, The Times, Financial Times, the BBC, Computing and ServerWatch.