Intelligence Elevates Deduplication Importance in Data Protection Strategies - Executive Viewpoint 2013 Prediction: FalconStor Software

Paul Kruschwitz (Profile)
Thursday, January 10th 2013

With data volumes doubling annually, companies are scrambling to store, manage, and protect them all. Deduplication technologies are one of the weapons in the IT arsenal that reduce the amount of data sent to backup appliances. By eliminating redundant copies of data at the backup, or secondary storage, level, companies reap the benefits of cost savings and efficiency. Yet deduplication is viewed within the storage industry as a standard storage appliance feature, a “simple” component. In fact, deduplication deserves the spotlight as an integral part of a modern storage strategy: it is an intricate application that demands resources, processes, and attention from IT staff. Looking ahead to 2013, deduplication will become the darling of the data center for its intelligence and global reach in storing and protecting all forms of data.

Originally introduced to the market to alleviate the overwhelming, costly backup burden companies were facing, deduplication solves two major data protection problems: the generation of multiple copies of data and inefficient storage. With deduplication, companies can cost-efficiently back up information to disk and replicate it to an off-site data center for longer-term and archival storage.

Deduplication technologies have made great strides in recent years, refining how duplicate data is found on the network. These advancements show up as greater deduplication and data compression ratios. Effectiveness was improved by increasing the size of the compression algorithm’s buffer, allowing duplicate data to be identified across larger samples so that the data can be represented with less overall information. The deduplication solution stores all of the metadata, otherwise known as hash values, for the data in its repository, not just that of the last full buffer, so its effectiveness is tied to the size of the repository used. Finally, once compared and deduplicated, the unique data is run through a compression algorithm before it is stored. Overall performance is reported as the product of the deduplication and compression ratios.
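The pipeline described above can be sketched in a few lines: split incoming data into chunks, keep a repository of hash values for every chunk ever seen, store only the chunks whose hashes are new, and compress those unique chunks before writing them. This is a minimal illustration, not any vendor’s implementation; the chunk size, the use of SHA-256 and zlib, and the class and method names are all illustrative assumptions.

```python
import hashlib
import zlib

class DedupStore:
    """Toy content-addressed store: fixed-size chunks, a repository of
    chunk hashes, and compression of unique chunks before storage.
    (Illustrative sketch only; names and parameters are assumptions.)"""

    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size
        self.repository = {}  # hash value -> compressed unique chunk

    def write(self, data: bytes) -> list:
        """Split data into chunks, storing only chunks not already seen.
        Returns the list of chunk hashes needed to rebuild the data."""
        recipe = []
        for i in range(0, len(data), self.chunk_size):
            chunk = data[i:i + self.chunk_size]
            h = hashlib.sha256(chunk).hexdigest()
            if h not in self.repository:
                # Unique chunk: compress it, then store it once.
                self.repository[h] = zlib.compress(chunk)
            # Duplicate chunks cost only a hash reference, not storage.
            recipe.append(h)
        return recipe

    def read(self, recipe: list) -> bytes:
        """Reassemble the original data from its chunk hashes."""
        return b"".join(zlib.decompress(self.repository[h]) for h in recipe)
```

Writing the same data twice adds nothing to the repository, which is the point of keeping hashes for all stored data rather than only the most recent buffer: the larger the repository of known hashes, the more duplicates can be recognized.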

Deduplication is now standard for companies of all sizes, as vendors include it in all-in-one storage appliances and backup software solutions. IT professionals must select these solutions carefully, however, as the deduplication performance they hope to achieve may not be realized. Solutions that add deduplication as an afterthought are limited in performance, scalability, reliability, and integration with archival tape systems. In fact, over the years, vendors that architected these smaller-scale, appliance-based solutions downplayed how they integrated with tape. Many companies, including those in the financial and healthcare industries, have strict data retention requirements; given the higher cost of disk, they still need backup solutions that allow data to be exported and stored to tape. As we head into 2013, deduplication solutions must provide a way for companies to continue using tape for archival purposes.