Q&A with Juan Orlandini of Datalink

By Juan Orlandini
Thursday, December 22nd 2011
Advanced

VSM: What is deduplication and what part does it play in data storage?

JO: Deduplication is the automated removal of redundant data in a storage system. As a data reduction technique, deduplication greatly increases the efficiency of storing information. It can be used in a number of different ways and places: networking, primary storage, backup, and data archiving.
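To make the definition concrete, here is a minimal sketch of one common way to remove redundant data: split the input into fixed-size chunks, fingerprint each chunk, and store each distinct chunk only once. This is an illustration only, not a description of any vendor's product. The 4 KB chunk size, the use of SHA-256, and the names `dedup_store` and `restore` are all assumptions made for the example.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed fixed chunk size; real products vary


def dedup_store(data: bytes, store: dict) -> list:
    """Split data into fixed-size chunks and keep one copy of each distinct chunk.

    `store` maps a chunk fingerprint to the chunk bytes. The returned
    "recipe" is the ordered list of fingerprints needed to rebuild `data`.
    """
    recipe = []
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).digest()
        store.setdefault(digest, chunk)  # a repeated chunk is not stored again
        recipe.append(digest)
    return recipe


def restore(recipe: list, store: dict) -> bytes:
    """Rebuild the original data from its recipe."""
    return b"".join(store[digest] for digest in recipe)
```

In this sketch, data made of three identical chunks followed by one different chunk produces a four-entry recipe but only two stored chunks, and `restore` returns the original bytes unchanged.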

VSM: How has the need for deduplication changed in recent years?

JO: The need hasn’t really changed. What’s changed is the maturity of the product offerings. Using these more mature solutions, organizations have been able to leverage deduplication technology for a number of different needs.

Originally, deduplication technology was used as an alternative to tape for backup and disaster recovery. This use case continues today, and deduplication has become a predominant technology for data protection. Its use has also evolved. For example, networking vendors have implemented a version of deduplication to greatly enhance the efficiency of transmitting data over Wide Area Networks. Storage vendors have also recognized the efficiencies available to them through these technologies and implemented them in their storage arrays. They’re offering deduplication as a way both to increase the available capacity and to optimize data transmission when replicating data.

Deduplication technology is not restricted to hardware vendors either. Software vendors are now offering deduplication as a more efficient mechanism for performing tasks like remote office backups or even data center client backups. For anyone interested, there is additional information on this topic in my blog.

VSM: What kinds of business problems can deduplication solve?

JO: Our customers are leveraging deduplication to solve a great variety of problems. In the data protection space, customers face increasing pressure to offer faster backups, even faster restores, and to do them with fewer resources than in the past. Data protection solutions that offer deduplication can, at the very least, significantly reduce the cost of protection to disk, often by a factor of more than 20.

More importantly, however, recovering lost information from these solutions is dramatically faster as well. A properly architected data protection solution that leverages deduplication can often either completely eliminate tape, or relegate tape to an archival medium. Many of Datalink’s customers are now able to replicate all of their backup data from one site to another. This eliminates the need for third-party tape handling and greatly improves the recoverability of their data.

Similarly, customers who have implemented deduplication in their primary storage arrays have been able to drastically reduce their storage requirements. With these technologies it’s possible to achieve huge storage efficiencies, particularly in virtualized environments. Regardless of the hypervisor a customer chooses (VMware, Hyper-V and so on), there is a huge amount of commonality among the virtual machine instances. Arrays that do deduplication are able to identify these commonalities and factor them out. We have seen customers that have been able to reduce their storage requirements by over 80% through deduplication.
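The arithmetic behind figures like these can be sketched in a few lines. The example below is hypothetical: it assumes virtual machine images that share a run of identical operating system blocks, counts logical versus unique blocks by fingerprint, and converts between a reduction ratio and a percentage saved. The block size and the function names `count_blocks`, `space_saved`, and `ratio_to_saved` are assumptions for illustration, not any array's actual method.

```python
import hashlib

BLOCK = 4096  # assumed block size


def count_blocks(images: list) -> tuple:
    """Return (logical_blocks, unique_blocks) across all images."""
    seen = set()
    logical = 0
    for image in images:
        for offset in range(0, len(image), BLOCK):
            seen.add(hashlib.sha256(image[offset:offset + BLOCK]).digest())
            logical += 1
    return logical, len(seen)


def space_saved(logical: int, physical: int) -> float:
    """Fraction of capacity saved, e.g. 0.8 means an 80% reduction."""
    return 1 - physical / logical


def ratio_to_saved(ratio: float) -> float:
    """Convert a reduction ratio such as 20 (meaning 20x) to a fraction saved."""
    return 1 - 1 / ratio


# Hypothetical data: 10 VM images, each with 9 shared OS blocks and 1 unique block.
os_blocks = b"".join(bytes([i]) * BLOCK for i in range(9))
images = [os_blocks + bytes([100 + vm]) * BLOCK for vm in range(10)]
logical, physical = count_blocks(images)
print(logical, physical, space_saved(logical, physical))
```

With these made-up images, 100 logical blocks collapse to 19 unique ones, a saving of about 81%. By the same arithmetic, a 20x reduction corresponds to about 95% less capacity.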