Dedupe, Yes. But Where? And How?

By Arthur Cole | Oct 14, 2010
http://www.enterprisenetworkingplanet.com/datacenter/datacenter-blog/dedupe-yes-where-and-how
"Those slippers will never come off, as long as you're alive," said the Wicked Witch to Dorothy. "But that's not what's worrying me. It's how to do it. These things must be done delicately, or you hurt the spell."

One of the more chilling phrases from The Wizard of Oz neatly sums up the dilemma that many enterprise executives find themselves in these days. Most can see the dramatic advancements in data center capabilities that are within reach, but they nonetheless must tread carefully, delicately, through a host of options, lest they end up damaging that which they hope to improve.

Take deduplication as an example. There's little debate anymore as to the efficacy of the technology. Done right, it dramatically lessens the data load on storage and networking architectures, freeing up increasingly valuable resources. Done wrong, it can hamper users' ability to access the very data that is vital to productivity and overall organizational prosperity.

One of the biggest debates is between source and target dedupe. Source solutions, like CommVault's Simpana 9, perform the deduplication process right at the client, which means less data not just in storage but over the network as well. In general, source solutions are more expensive because you need to place dedupe engines at more locations. However, CommVault's all-software approach minimizes this problem by installing backup engines on application servers.
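To make the source-side idea concrete, here is a minimal sketch of client-side dedupe, not CommVault's actual implementation: the agent chunks a file locally, hashes each chunk, and ships only chunks the target hasn't seen, so duplicate data never crosses the wire. The known_hashes set and send_chunk callback are assumptions standing in for the agent's index and transport.

```python
import hashlib

CHUNK_SIZE = 64 * 1024  # fixed-size chunks for simplicity; real engines use variable-size chunking

def backup_source_side(path, known_hashes, send_chunk):
    """Source-side dedupe: hash each chunk at the client and transmit only
    chunks the target doesn't already hold.

    known_hashes: set of digests the target already stores (assumed synced);
    send_chunk: hypothetical transport callback taking (digest, bytes).
    """
    manifest = []  # ordered digest list -- enough to reassemble the file later
    with open(path, "rb") as f:
        while chunk := f.read(CHUNK_SIZE):
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in known_hashes:
                send_chunk(digest, chunk)  # only unique data hits the network
                known_hashes.add(digest)
            manifest.append(digest)
    return manifest
```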

More centralized systems may leave the network largely out of the equation, but they make up for it with impressive performance at the storage array. Quantum's DXi8500, for example, acts as a virtual tape library with 200 TB of capacity and can dedupe 6.5 TB of data per hour. The system improves on its predecessor with RAID 6 data protection and an upgrade from two 4 Gbps Fibre Channel ports to six 8 Gbps ports, plus two 10 GbE and four 1 GbE ports.
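By contrast, a target-side box sees the entire backup stream arrive over the wire and deduplicates on ingest. A rough sketch of that path follows; it is illustrative only, with an in-memory dict standing in for a real appliance's segment index and disk pool.

```python
import hashlib

class TargetDedupeStore:
    """Target-side dedupe: the full stream crosses the network; the
    appliance drops duplicate segments as they arrive."""

    def __init__(self):
        self.segments = {}  # digest -> bytes; stands in for the disk pool

    def ingest(self, stream, chunk_size=64 * 1024):
        manifest, raw_bytes, stored_bytes = [], 0, 0
        while chunk := stream.read(chunk_size):
            digest = hashlib.sha256(chunk).hexdigest()
            raw_bytes += len(chunk)
            if digest not in self.segments:
                self.segments[digest] = chunk  # each unique segment stored once
                stored_bytes += len(chunk)
            manifest.append(digest)
        return manifest, raw_bytes, stored_bytes  # ratio = raw_bytes / stored_bytes
```

Feed the same stream through ingest twice and stored_bytes barely moves on the second pass, which is the whole point: capacity savings at the array, at the cost of full traffic on the network.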

With performance like that, it's no wonder all the major storage firms are not only investing in dedupe, but are figuring out ways to make it faster, says Drew Robb on Enterprise Storage Forum. Data Domain's Boost, for example, shifts part of the dedupe process onto the backup media server, cutting the network bandwidth a backup job consumes and offloading work from the Data Domain appliance. EMC customers will soon see Boost bundled into the NetWorker backup stack, offering control of the Data Domain infrastructure from the NetWorker console.
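Conceptually, that kind of distributed dedupe is a fingerprint-first exchange: the media server asks the appliance which segments it lacks before sending any bulk data. The real Boost protocol is proprietary; the sketch below, with its hypothetical filter_new_digests, store_chunk, and commit_manifest calls, is just one way such a round-trip could be shaped.

```python
import hashlib

def boost_style_backup(chunks, appliance):
    """Fingerprint-first transfer: one metadata round-trip tells the media
    server which segments the target lacks; bulk data moves only for those.
    `appliance` is a hypothetical client stub, not a real Boost API."""
    digests = [hashlib.sha256(c).hexdigest() for c in chunks]
    missing = set(appliance.filter_new_digests(digests))  # cheap metadata query
    for digest, chunk in zip(digests, chunks):
        if digest in missing:
            appliance.store_chunk(digest, chunk)  # send only unseen segments
    appliance.commit_manifest(digests)  # recipe to rebuild the backup image
```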

In truth, says Spectra Logic's Kevin Dudak, all the major dedupe platforms are effective at their primary function: reducing data loads. However, each one has a distinct personality that can vastly affect how it integrates into different environments. Spectra's nTier, for example, is a VTL-based appliance, so it fits well with tape-based backup and archive systems. A product with a NAS interface, meanwhile, would be more at home in an environment with a lot of production data, even if its raw performance lags behind a VTL-based box.

That's why it's important to do your homework before committing capital to such a vital data center function. Vendors will no doubt give you the hard sell as to exactly why their system is the perfect fit, but only through careful evaluation will you be able to make the call for yourself.