Mass data fragmentation: take control of ‘bad data’

Data is undeniably one of the most valuable resources for organisations today. Because it drives key business decisions across industries, understanding data, and data analytics in particular, is often crucial to success. Modern businesses are adopting technology to organise and comprehend the huge amounts of information they now collect. But when the data behind big decisions is ‘bad’, or corrupt because suitable governance and cleaning techniques aren’t in place to guarantee its quality, what does that mean for data-driven businesses?

Recent findings from Experian highlight that customer experience ambitions are being challenged by this so-called ‘bad data’. Organisations suspect that almost a third of their data is inaccurate, and 70% said they don’t have the direct control they need to impact strategic objectives. Incorrect ownership (69%), lack of trust in data (49%) and information overload (65%) are the three most common factors preventing businesses from using data to their advantage.

In short, business decisions are being compromised by what organisations view as poor data quality. The question is, what is driving that deficiency in quality?

The key driver of data mistrust

It’s important to note that a lack of trust in insights, or ‘bad data’, isn’t necessarily a result of the data itself, but of how it is being managed and collected. While these findings were certainly sobering, they were not necessarily surprising. Historically, businesses have been slow to tackle data quality issues, preferring instead to endure the pain and fix issues reactively. In fact, the term ‘mass data fragmentation’ has been coined to describe the current driver of data quality issues: data fragmentation en masse.

This refers to data that is siloed, scattered or duplicated in multiple copies across an organisation’s IT systems, which prevents a complete, single view of the data and its components and makes it hard to extract real value from it. These data sets are typically located on secondary storage, used for backups, archives, object stores, file shares, test and development, and analytics. What’s more, this is the vast majority of business data – around 80%.

However, when this data is fragmented, as is often the case, it can be extremely difficult to locate, manage or put to any use. So it’s really no wonder that Experian’s research revealed so many organisations suspecting that much of their data is inaccurate, ‘bad’ or difficult to control. They are probably not far wrong.

Research of 900 IT leaders by Cohesity showed that many business leaders view their secondary data as a very expensive storage bill, an unending management headache, a growing compliance risk and even a threat to morale in IT. Both sets of research demonstrate that a lack of control around data ownership will impact strategic ambitions, particularly around customer experience, but also agility, growth and competitiveness. It seems obvious that businesses that can’t get in front of mass data fragmentation, and eradicate such data quality issues, face serious disadvantages that may jeopardise success for years to come.

Bad data: not just an IT issue

There are three reasons why mass data fragmentation puts businesses at a disadvantage. Firstly, from a CIO’s point of view, the inability to manage data and harness insights from it is a big competitive disadvantage when it comes to customer satisfaction and the development of products and services. Secondly, not knowing what your data contains or where it is located can cause compliance vulnerabilities and security risks. Since GDPR came into force, this has been potentially perilous – particularly if you’re dealing with sensitive data such as financial or health records. And thirdly, struggling to manage fragmented data is a drain on time and resources that could be better spent elsewhere.

The problem of bad data and mass data fragmentation is not only an IT concern: it’s a business one. Experian’s research highlighted that 75% of data practitioners think responsibility for data should lie across multiple departments, with occasional help from IT. But only 13% of businesses across the UK are currently implementing technology to assist with this. If IT is expected to manage all the organisation’s secondary data and apps across all locations, but technology isn’t in place to accomplish that goal, IT leaders will understandably be worried about a wide array of major problems occurring in several different areas.

Cohesity’s research showed that as many as 38% of IT leaders fear massive turnover of the IT team, 26% fear they (or members of their team) will consider quitting their jobs, 43% fear the culture within the IT team will take a nosedive and 42% fear employee satisfaction and morale will decrease. More widely, Cohesity’s research found that over 90% of senior IT decision makers said that if they could reallocate just half of the resources they spend managing their organisation’s secondary data to other business priorities, it would have a big revenue impact over a five-year period.

How to mitigate a ‘data disconnect’ in your business

While technology plays a key role in data management and the improvement of data quality, changes in working processes and employee behaviour are critical too. Experian recommends that companies hiring a Chief Data Officer to help with data maintenance problems implement the right strategies to reinforce both data compliance and security, while allowing quick access to data for immediate business use.

Similarly, if a solution is deployed that gives IT ways to directly tackle mass data fragmentation, the architect of poor data quality, the rewards for the company could be very significant, but the results need to be measured and analysed.

The solution, of course, would have to encompass a better way of storing, managing, protecting and extracting value from the wide-ranging pools of secondary data. That means breaking down business silos, reducing copies, and using technology to give access to data without everyone needing their own copies. Storage requirements slim down as a result: a win for IT and the departments it serves. But this is as much a cultural and business process issue as it is a measurement or technology one.

At the end of the day, reliable data doesn’t just provide a platform for making better business decisions; it can also earn you a reputation as a reliable and trustworthy business. In the current business world, that is priceless.

Written by John Lucey, UK Country Manager, Cohesity
