Ten years ago – maybe longer – data deduplication promised to be one of the hottest trends in storage technology. Most technologies, hot or not, pass their prime, but dedupe is still going from strength to strength.
Analyst house Gartner has suggested that the market for deduplication backup target appliances is more active now than ever, and that these systems go much further than simply allowing organisations to replace tape backup with disk.
Most organisations include some form of dedupe in their data protection strategy to remove duplicate data, reduce the volume they back up and, in turn, save money. But there is more to the staying power of dedupe than cutting costs and curbing overprovisioning.
Deduplication is no longer just a band-aid for a storage-hungry backup infrastructure. Today, modern deduplication has a direct impact on a company's operational efficiency and enhances its ability to back up and recover data. Legacy approaches served well for a long time, but the demands placed on them have changed.
Massive data growth has fuelled this evolution: who knew ten years ago that we'd be routinely storing terabytes, even petabytes, of data? Any phone today rivals the storage capacity and memory of a laptop from a few years ago. More data means more storage, and more storage again for backups. Ask most people who work with data and they'll tell you it's the most vital asset a business has, and they're right.
These terabytes of data invariably include information you need to keep, whether by law, for service level agreements or simply to maintain a good understanding of the business. It makes sense, then, to strip out duplicated data, so that you have space to store crucial information and the capacity to recover it in real time if something goes wrong.
But it’s not just about the amount of data we’re creating – it’s the fact that it’s very difficult to predict how that volume is likely to grow over the next 12 months, let alone five years.
By choosing the right deduplication approach, organisations can expand their current system as and when they need to. That scalability is crucial for cutting costs and keeping backed-up data manageable.
That's where the different types of dedupe come in. Not all solutions are the same, and not all suit every scenario. That's why users are taking a much more strategic approach to dedupe.
This move represents quite a shift in thinking: away from a one-size-fits-all approach and towards investigating what is needed, looking closely at each workload and selecting the best way to back it up.
Target deduplication is the legacy approach. It has served well and seen broad adoption, but its efficiency has limits. Recall those terabytes and petabytes, then imagine shipping every single piece of that data across the network to the backup appliance, within a finite backup window, before any duplicates are removed. That is how target dedupe works: data is only deduplicated after it arrives at the target, and that is why this kind of deduplication struggles in larger environments.
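To make that limitation concrete, here is a minimal Python sketch of what a target appliance does, not any vendor's implementation: every chunk crosses the wire first, and duplicates are discarded only on arrival. The fixed 4 KB chunk size and in-memory dictionary store are simplifying assumptions; real appliances typically use variable-size chunking and purpose-built indexes.

```python
import hashlib

CHUNK = 4096  # hypothetical fixed chunk size, for illustration only

def target_dedupe_backup(data: bytes, store: dict) -> int:
    """Simulate target-side dedupe: every chunk is transmitted, and the
    appliance keeps only one copy of each unique chunk on arrival.
    Returns the number of bytes that travelled over the network."""
    sent = 0
    for i in range(0, len(data), CHUNK):
        chunk = data[i:i + CHUNK]
        sent += len(chunk)  # the full chunk always crosses the wire
        # store each unique chunk once, keyed by its SHA-256 fingerprint
        store.setdefault(hashlib.sha256(chunk).hexdigest(), chunk)
    return sent

store = {}
payload = b"same block" * 50000  # highly redundant data, ~500 KB
wire_bytes = target_dedupe_backup(payload, store)
print(wire_bytes == len(payload))  # True: bandwidth is not reduced at all
print(len(store) < 10)             # True: storage, however, is deduplicated
```

Storage shrinks dramatically, but the network and the backup window still have to absorb the full data volume on every run, which is exactly the pinch point in large environments.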
This is one of the reasons behind the growth in global source-side deduplication. It deduplicates data at the source and shares the deduplicated store across all nodes, which lets end users complete backups faster, reduces the bandwidth needed for backups and improves client performance. It also offers the scalability and cost benefits many businesses need, because it shrinks the backup storage footprint.
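Source-side dedupe can be sketched in a few lines of Python: the client fingerprints each chunk with SHA-256 and transmits only chunks the shared store has never seen. The fixed 4 KB chunk size and in-memory dictionary are simplifying assumptions, not any vendor's design; real products exchange fingerprints, not full chunks, before deciding what to send.

```python
import hashlib

CHUNK = 4096  # hypothetical fixed chunk size, for illustration only

def source_dedupe_backup(data: bytes, store: dict) -> int:
    """Simulate source-side dedupe: the client hashes each chunk locally
    and only transmits chunks whose fingerprint the shared store lacks.
    Returns the bytes that actually crossed the wire (fingerprint
    traffic is ignored for simplicity)."""
    sent = 0
    for i in range(0, len(data), CHUNK):
        chunk = data[i:i + CHUNK]
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in store:   # only never-seen data travels
            store[digest] = chunk
            sent += len(chunk)
    return sent

# Because the store is shared, a second node backing up the
# same content sends nothing at all.
store = {}
payload = b"log line\n" * 60000  # ~540 KB of highly redundant data
first = source_dedupe_backup(payload, store)
second = source_dedupe_backup(payload, store)  # e.g. another node
print(first < len(payload) and second == 0)    # True
```

The first backup already sends a fraction of the raw volume, and every subsequent node or run sends only what is genuinely new, which is where the bandwidth and backup-window savings come from.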
Dedupe’s constant evolution keeps the technology fresh, and part of the backup and recovery conversation, more than a decade after it became pervasive.
As long as data volumes continue to grow and as long as dedupe technology manages to adapt to new demands and challenges, it will be high on the agenda in the future too.
Sourced from Christophe Bertrand, Arcserve