Retention: which data de-dupe solution?

October 2007, by Alexandre Delcayre, Technical Director EMEA, FalconStor

Complex IT systems drive storage volumes to grow exponentially, while regulatory regimes require businesses to store and protect data for longer. Compression technology can deliver an average 2:1 reduction in data volume, but that is only a fraction of what is needed to cope with the data deluge most companies now face. Fortunately, data de-duplication is widely regarded as the Next Big Thing in backup.
Only data de-duplication technology can truly reduce data volumes at the required scale, and ‘de-dupe’ is accordingly fast becoming a ‘required’ technology for any company wanting to optimise the cost-effectiveness and performance of its data storage environment. Many solutions exist, however, which makes the choice a complex one. Customers can make the right choice by weighing candidate solutions against eight key factors.
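
To see why compression alone falls short, consider a back-of-the-envelope comparison. The figures in this sketch, 30 retained full backups and a 2% daily change rate, are assumptions chosen for illustration rather than measurements:

```python
# Illustrative arithmetic only -- the ratios below are assumptions, not benchmarks.
# Scenario: 30 retained nightly full backups of a 1 TB dataset in which roughly
# 2% of the data changes from one backup to the next.

full_backups = 30
dataset_tb = 1.0
change_rate = 0.02                       # assumed daily change rate

raw_volume = full_backups * dataset_tb   # 30 TB landing in secondary storage

# Compression alone: the average 2:1 reduction cited above.
compressed = raw_volume / 2.0

# De-duplication: the first full copy plus only the changed data of each
# subsequent backup is unique; everything else is a duplicate.
unique = dataset_tb + (full_backups - 1) * dataset_tb * change_rate

print(f"raw: {raw_volume:.1f} TB")
print(f"compression (2:1): {compressed:.1f} TB")
print(f"de-duplicated: {unique:.2f} TB (~{raw_volume / unique:.0f}:1)")
```

Under these assumed figures, compression halves 30 TB to 15 TB, while de-duplication stores roughly 1.6 TB, close to a 19:1 reduction. The exact ratio always depends on the data and the retention policy, as the conclusion below stresses.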

To choose a cost-effective, high-performance, and scalable long-term data storage solution, concentrate on:

1. Focusing on the largest problem
The largest problem is backup data in secondary storage. Research from Enterprise Strategy Group (2007) illustrates why a new technology evolution in backup is necessary.

Incremental and differential backups were introduced to reduce the amount of data that must be copied compared with a full backup. Even with incremental backups, however, there is significant duplication when protection is based on file-level changes. Across multiple servers at multiple sites, the potential storage reduction from a data de-duplication solution becomes huge.

2. Integration with current environment
Who wants disruption? Many companies are turning to virtual tape libraries (VTL) as a non-disruptive way to improve the quality of their backups.
3. VTL capability
Is your VTL up to the task? If data de-duplication technology is implemented around a VTL, the capabilities of the VTL itself must be considered.
4. The impact of de-duplication on backup performance
Where and when does de-dupe take place? Some solutions attempt de-duplication while data is being backed up, which can degrade VTL performance by as much as 60% over time.
5. Scalability
Consider growth expectations over five years or more. How much data will you want to keep on disk for fast access? How will the data index system scale? (A back-of-the-envelope sizing example follows this list.)
6. Distributed topology support
Where is your data? Data de-duplication can deliver benefits throughout a distributed enterprise.
7. Real-time repository protection
Could your resulting data store be vulnerable? Access to the de-duplicated data repository should not depend on a single point of failure.
8. Efficiency and effectiveness
How rigorous are you prepared to be? File-based de-dupe yields much less storage reduction than methods that analyze data at a sub-file or block level, as the sketch after this list illustrates.
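
To make the file-level versus block-level distinction concrete, here is a minimal sketch of block-level de-duplication. It is illustrative only, not any vendor's implementation: it assumes fixed 8 KB blocks and a simple in-memory index, where real products typically use variable-size, content-defined chunking and a persistent, scalable index (factor 5). It also de-duplicates inline, as data arrives, which is exactly the design choice factor 4 flags for its performance cost.

```python
"""Minimal sketch of block-level de-duplication: data is cut into fixed-size
blocks, each block is identified by its SHA-256 digest, and only previously
unseen blocks are stored; duplicates become references to the stored copy."""

import hashlib

BLOCK_SIZE = 8 * 1024  # assumed fixed 8 KB blocks, for simplicity


class DedupeStore:
    def __init__(self):
        self.blocks = {}       # digest -> block bytes (the single stored copy)
        self.raw_bytes = 0     # total bytes ingested
        self.stored_bytes = 0  # bytes actually kept after de-duplication

    def ingest(self, data: bytes) -> list[str]:
        """Store a backup stream; return the recipe (digest list) to restore it."""
        recipe = []
        for i in range(0, len(data), BLOCK_SIZE):
            block = data[i:i + BLOCK_SIZE]
            digest = hashlib.sha256(block).hexdigest()
            self.raw_bytes += len(block)
            if digest not in self.blocks:   # new content: store it once
                self.blocks[digest] = block
                self.stored_bytes += len(block)
            recipe.append(digest)           # duplicate: keep a reference only
        return recipe

    def restore(self, recipe: list[str]) -> bytes:
        return b"".join(self.blocks[d] for d in recipe)


store = DedupeStore()
monday = b"A" * 65536 + b"B" * 65536    # Monday's full backup
tuesday = b"A" * 65536 + b"C" * 65536   # Tuesday: half the data is unchanged
r1, r2 = store.ingest(monday), store.ingest(tuesday)
assert store.restore(r1) == monday and store.restore(r2) == tuesday
print(f"raw {store.raw_bytes} B, stored {store.stored_bytes} B "
      f"-> {store.raw_bytes / store.stored_bytes:.1f}:1")
```

In this toy run a file-level scheme would store both days' data in full, because every file changed, whereas the block store keeps only three unique blocks and can reconstruct either backup from its recipe.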

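Factor 5 is also worth putting numbers on. The following back-of-the-envelope calculation uses assumed parameters throughout (protected capacity, average block size, bytes per index entry) purely to show how quickly the block index grows:

```python
# Back-of-the-envelope index sizing -- all parameters are assumptions,
# chosen only to illustrate why index architecture matters (factor 5).

capacity_tb = 100        # assumed protected capacity on disk
avg_block_kb = 8         # assumed average block size
bytes_per_entry = 40     # assumed: ~32-byte digest plus location metadata

blocks = capacity_tb * 1024**4 / (avg_block_kb * 1024)
index_bytes = blocks * bytes_per_entry

print(f"{blocks:.2e} blocks -> index of ~{index_bytes / 1024**3:.0f} GiB")
```

With these assumed figures, a 100 TB repository implies an index of roughly 500 GiB, far more than fits in the memory of a single node, so examine how a candidate solution pages, partitions, or distributes its index.
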
Above all, don’t be seduced by the hype that sometimes surrounds the technology. The benefits of data de-duplication are dramatic, but in the final analysis the reduction that can be achieved is driven by the nature of the data and the policies used to protect it.
To achieve the maximum benefit, choose a data de-duplication solution based on the total set of requirements, not just the biggest theoretical data reduction ratio you hear about.

