In 2018, organisations are likely to make greater use of object storage for structured/tagged data, and object storage itself will likely begin creating structured/tagged data from unstructured data. Metadata will be the way to dig out from the avalanche of data generated by artificial intelligence, and as data volumes explode, organisations will start adopting new, more economical storage strategies.
1. Object storage for structured/tagged data, and object storage creating structured/tagged data from unstructured data
The recent past saw an explosion of analytics technology as businesses demanded tools to unlock insights from their vast and growing sets of data. Now that those tools are available, the pendulum will swing back to the demand side of the equation and force businesses to pay more attention to collection, management and storage of that increasingly valuable data.
The amount of data that businesses are collecting is spiking – witness Intel’s statement late last year that the average car will generate 4000 GB of data per hour of driving. But it’s not just the raw data that’s causing the spike.
Emerging data formats are causing a spike, too, as we collect more unstructured data – think of the video captured by a Tesla and shared with the company.
In order to use this data, you have to understand it. That need for rapid understanding is driving a spike in data tagging and an associated spike in the collection of metadata. As data volume increases, the value of structured or tagged data increases disproportionately to the value of unstructured data.
This data is collected by sensors of various types – wear and use sensors built into machines, video and radar recorders and transmitters in cars, and medical equipment that captures patient health information, for example. As that data – and its associated metadata – is leveraged for business advantage, it’s likely to spawn even more data.
As businesses analyse these enormous data sets, they will start to identify additional data types that could lead to further correlations and additional business advantages, which will justify the deployment of more sensors to understand more about the real-world experiences of customers.
The data from these sensors will generate increased need for greater storage capacity and advanced data storage management, which will lead to more insights and more sensors, and so on.
Businesses are at the beginning of a journey where they will strive to understand more about their customers digitally, and it will increase the strain on IT to keep growing storage infrastructures while preserving the usefulness of the data.
2. AI will trigger an avalanche of data
Business is taking advantage of a range of new technologies – artificial intelligence, high-quality video, Internet of Things, analytics and more. These technologies have three things in common: they are all data-intensive, they demand ever-greater storage capacity, and they depend on tagged data to function most effectively.
It does little good to store vast amounts of data if you have no way to access what you need to retrieve, or if you don’t have any idea of which data assets exist in the first place. Metadata is the key to extracting value from the data.
Structured/tagged data is a form of metadata – a model of the data. Metadata and models are a higher level of abstraction above the raw object data and are required for analytics.
Without metadata, the unstructured data captured by data formats like video becomes an unsearchable liability instead of an asset. With metadata, the data can be navigated, analysed, understood and put to use.
A good example of this is in the management of video assets; media and entertainment, surveillance and security, and even automotive uses of video are increasing dramatically.
But it’s not reasonable to expect your employees to watch endless hours of video in search of the single clip you need. Instead, facial recognition software built on AI will be used to sift through tens of thousands of hours of material to tag recognised faces, meaning that when the need arises it’s a simple task to locate just the right clip or clips.
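The workflow described above can be sketched as a tag index: a recognition pipeline attaches labels to clips as they are ingested, and finding footage becomes a lookup rather than a viewing session. The clip names, labels and `ClipIndex` class below are hypothetical illustrations, not any particular product's API.

```python
# A minimal sketch of metadata-driven video search: instead of replaying
# footage, we query tags that a recognition pipeline already attached.
from collections import defaultdict

class ClipIndex:
    """Maps tags (e.g. recognised faces) to the clips/timestamps where they appear."""

    def __init__(self):
        self._index = defaultdict(list)

    def tag(self, clip_id, timestamp_s, label):
        # In practice, labels would be produced by an AI pipeline
        # (face recognition, object detection, speech-to-text, ...).
        self._index[label].append((clip_id, timestamp_s))

    def find(self, label):
        # Every (clip, timestamp) pair carrying the label, in ingest order.
        return self._index.get(label, [])

index = ClipIndex()
index.tag("press_event.mp4", 312.5, "ceo")
index.tag("shareholder_qa.mp4", 84.0, "ceo")
index.tag("press_event.mp4", 45.0, "cfo")

print(index.find("ceo"))
# → [('press_event.mp4', 312.5), ('shareholder_qa.mp4', 84.0)]
```

The point is that the expensive work (recognition) happens once at ingest; every later search is a cheap metadata lookup.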
As AI/ML generates and uses metadata and models, systems that can efficiently and effectively manage the metadata and models become critical. AI will become an indispensable tool for finding the most valuable data within enormous data sets.
3. Economics will force businesses to adopt new storage strategies as data volumes explode
Not only is data a tremendously valuable asset, but businesses are creating more of it, faster than ever before. That means that businesses must invest in new strategies for safeguarding and protecting data – but at the same time, they have to store data in economically responsible ways. That’s not as easy as it might sound. Configurable data policies that control how the data are stored once inside a storage system will become critical.
These data policies can control the durability, cost, availability, and other properties of the data according to dynamic optimisation criteria. A simple example is moving data from hot to cold storage. But the optimisation criteria are continuously variable, and can reflect tradeoffs based on business priorities.
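A configurable policy of the kind described above can be sketched as a small rule over per-object statistics. The thresholds, field names and `ObjectStats` record here are illustrative assumptions; real systems (for example, object storage lifecycle rules) expose comparable knobs declaratively.

```python
# A minimal sketch of a hot/cold tiering policy driven by object age and
# access frequency. Thresholds are hypothetical and would be tuned to
# business priorities (cost vs availability vs durability).
from dataclasses import dataclass

@dataclass
class ObjectStats:
    key: str
    age_days: int
    reads_last_30d: int

def choose_tier(obj: ObjectStats,
                max_hot_age_days: int = 30,
                min_hot_reads: int = 10) -> str:
    """Keep recently created or frequently read objects on the hot tier."""
    if obj.age_days <= max_hot_age_days or obj.reads_last_30d >= min_hot_reads:
        return "hot"   # fast, more expensive storage
    return "cold"      # slower, cheaper archive

print(choose_tier(ObjectStats("invoice-2017.pdf", age_days=200, reads_last_30d=0)))
# → cold
print(choose_tier(ObjectStats("dash-metrics.json", age_days=5, reads_last_30d=120)))
# → hot
```

Because the criteria are ordinary parameters, the same policy engine can trade durability for cost, or availability for price, simply by changing the rule and its thresholds.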
For example, a user may want to trade data durability for lower cost. One of the paradoxes of data storage is that the more you store in primary storage, the more per unit of storage it costs. That was bearable when a terabyte was considered a lot of storage, but today most large businesses have multiple petabytes of data under management.
Is it affordable to keep it all in primary storage? Or is it smarter to look for secure archiving combined with advanced search tools to keep data costs down while making sure you can find the data when you need it?
The concept of tiered storage has existed for 20 years, but it was deemed an unnecessary expense as disk storage prices tumbled in the early 2000s. In 1995, storage costs for a GB of data were around $1000.
Within five years, the cost per GB plummeted to less than 10 cents per GB, and today it costs around 3 cents per GB. Drive capacity was so inexpensive that there was little reason to invest in storage management.
Today, however, the volume of data is flipping the economic formula on its head. Storage prices that once seemed negligible are now looking like less of a bargain since the amount of data stored is so enormous.
These economic factors will prompt businesses to revisit these old strategies in the coming year. Returning to them is easier now for two reasons: archived data is much more quickly available today, and search technology has come a very long way, allowing archived data to be examined almost as easily as data in primary storage.
Sourced by Gary Ogasawara, vice president of Engineering, Cloudian