The rules around how data is stored, architected and queried are changing dramatically. With the Internet of Things (IoT), artificial intelligence (AI) and analytics driving the trend, big data is expected to represent 30% of data stored in data centres by 2021.
What’s key is that the majority of this data, around 80% according to Gartner, is unstructured. Traditional databases, designed pre-internet, struggle to handle the type and volume of data being created. As a result, SQL queries and relational database systems are giving way to alternatives.
These traditional systems are built around the concept of structured data tied to very specific and narrowly focussed applications in limited domains. The amount of data they hold and the scope of its use are also limited, greatly reducing their viability in the world of unstructured big data. So, what’s next?
The need for better storage
Current infrastructure for storing data is horribly inefficient. Relying on third parties for things like cloud storage, as many enterprises now do, raises a myriad of issues: not only is scaling expensive, but performance is significantly hindered when dealing with large amounts of data. Aside from the technical limitations, there are privacy risks in trusting a service with information (particularly sensitive information), and we’ve seen an increasing number of data breaches occur within centralised systems.
Businesses should be looking at alternative methods of storage – they’re in possession of more data than ever before (2.5 quintillion bytes of it are produced daily). And that data is wasted if it can’t be used to derive insights that can be leveraged to target a wider audience and increase revenue.
For storing data in this rapidly developing climate, it’s imperative that more resilient and efficient databases are created. They need to be highly secure and adept at catering to the needs of applications in the fields of IoT and AI. To these ends, I believe that blockchain technology is an ideal solution.
Distributing data with blockchain
Instead of being run by an entity such as Amazon or Google, a blockchain has its integrity assured by nodes in the network that sync copies of the database. From a security standpoint, this is incredibly hard to compromise: a party would need to gain control of the majority of the nodes to be able to alter the entries on the ledger. Since the nodes are distributed and operate peer-to-peer, there is no central point to become a bottleneck. One of the most important features of blockchain systems, however, is immutability: once an entry is appended to the database, it cannot be removed.
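To make the immutability point concrete, here is a minimal sketch of an append-only ledger in which each entry stores the hash of the one before it. It is an illustration only, not how any particular blockchain is implemented: the names `Ledger`, `entry_hash` and `is_valid` are hypothetical, and there is no networking or consensus between nodes.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(index, payload, prev_hash):
    """Hash an entry's contents together with the previous entry's hash."""
    body = json.dumps(
        {"index": index, "payload": payload, "prev": prev_hash},
        sort_keys=True,
    )
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


class Ledger:
    """Hypothetical single-node, append-only ledger (no consensus)."""

    def __init__(self):
        self.entries = []

    def append(self, payload):
        """Add a new entry linked to the current tail of the chain."""
        prev_hash = self.entries[-1]["hash"] if self.entries else GENESIS_HASH
        index = len(self.entries)
        entry = {
            "index": index,
            "payload": payload,
            "prev": prev_hash,
            "hash": entry_hash(index, payload, prev_hash),
        }
        self.entries.append(entry)
        return entry

    def is_valid(self):
        """Recompute every hash and link; any edit or removal is detected."""
        prev_hash = GENESIS_HASH
        for position, entry in enumerate(self.entries):
            if entry["index"] != position or entry["prev"] != prev_hash:
                return False
            expected = entry_hash(entry["index"], entry["payload"], entry["prev"])
            if entry["hash"] != expected:
                return False
            prev_hash = entry["hash"]
        return True
```

Because every hash depends on the one before it, changing or removing an earlier entry invalidates the chain from that point on, which is why a node holding an honest copy can spot a tampered one.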
Using blockchain for databases seems like a logical step forward. There’s definitely an emerging movement seeking to lay the foundations for a decentralised architecture across industries. With blockchain, a marketplace akin to Airbnb or Uber can materialise for storing data: nodes on the network can be incentivised to replicate and retain information using a blockchain protocol’s inbuilt payment layer.
This concept can be taken a step further with the use of sharding and swarming. Sharding offers a greater degree of privacy: instead of sending a whole file to other nodes, you distribute fragments of it. In this way, the owner can be sure that those in possession of their data cannot access it, as each holds only a small (and unreadable) piece, much like torrenting.
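The fragmenting step might be sketched as follows. This is a simplified illustration under my own assumptions: the functions `shard`, `distribute` and `reassemble` and the node names are hypothetical, shards are placed round-robin, and no encryption is applied. A plain fragment can still expose part of a file, so a real system would also need to encrypt the data before splitting it for the pieces to be truly unreadable.

```python
import hashlib


def shard(data: bytes, shard_size: int):
    """Split data into ordered fragments, each tagged with a checksum."""
    if shard_size <= 0:
        raise ValueError("shard_size must be positive")
    shards = []
    for offset in range(0, len(data), shard_size):
        chunk = data[offset:offset + shard_size]
        shards.append({
            "seq": len(shards),  # position of this fragment in the file
            "sha256": hashlib.sha256(chunk).hexdigest(),
            "chunk": chunk,
        })
    return shards


def distribute(shards, node_ids):
    """Spread shards across the given nodes in round-robin order."""
    placement = {node_id: [] for node_id in node_ids}
    for item in shards:
        target = node_ids[item["seq"] % len(node_ids)]
        placement[target].append(item)
    return placement


def reassemble(placement):
    """Collect shards from every node, verify them, and rebuild the file."""
    collected = [item for held in placement.values() for item in held]
    collected.sort(key=lambda item: item["seq"])
    for item in collected:
        if hashlib.sha256(item["chunk"]).hexdigest() != item["sha256"]:
            raise ValueError(f"shard {item['seq']} failed its checksum")
    return b"".join(item["chunk"] for item in collected)
```

With five shards spread over three nodes, for example, no single node ends up holding the complete file, yet the owner can still rebuild it by collecting every piece.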
Swarming, in the context of data, is a concept that divides the network up into clusters of nodes, based on their geographical location. This is vital to ensuring that the network can handle high throughput around the clock. Nodes in a swarm can pull data from those closest to them (reducing latency), or download shards in parallel from multiple sources for incredibly fast retrieval. The same shard is replicated and stored across several nodes, so that, if one goes down, the data is still accessible. Unlike centralised servers, which need to be taken offline every so often for maintenance, a distributed system remains constantly functional.
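As a rough sketch of how replication and proximity work together, the function below picks, for each shard, the closest replica that is still online. Everything here is assumed for illustration: `plan_fetch` is a hypothetical helper, latency is taken as a known number per node, and the node names are invented.

```python
def plan_fetch(shard_replicas, online, latency_ms):
    """Choose the lowest-latency live replica for every shard.

    shard_replicas: dict mapping shard id -> list of node ids holding a copy
    online:         set of node ids that are currently reachable
    latency_ms:     dict mapping node id -> measured latency (assumed known)
    """
    plan = {}
    for shard_id, replicas in shard_replicas.items():
        live = [node for node in replicas if node in online]
        if not live:
            # Every replica of this shard is offline, so it cannot be served.
            raise LookupError(f"no live replica for shard {shard_id}")
        plan[shard_id] = min(live, key=lambda node: latency_ms[node])
    return plan


# Two shards, each replicated on two nodes.
shard_replicas = {0: ["london", "paris"], 1: ["paris", "tokyo"]}
latency_ms = {"london": 8, "paris": 15, "tokyo": 210}

# All nodes up: each shard comes from its nearest holder.
all_up = plan_fetch(shard_replicas, {"london", "paris", "tokyo"}, latency_ms)

# London goes down: shard 0 is still served, now from Paris.
london_down = plan_fetch(shard_replicas, {"paris", "tokyo"}, latency_ms)
```

Since each shard in the plan can come from a different node, the downloads could then run in parallel, and the loss of one node simply shifts requests to the next-nearest replica.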
A new infrastructure
AI requires huge volumes of data in order to train neural networks to perform better in a range of industries. Big data analytics, as the name might indicate, is based on crunching large data sets.
There’s a common theme here: a need for an infrastructure to contain a lot more information than we’re used to storing. The archaic pre-internet databases that we’ve used until now are clearly failing to keep up. It’s time to use blockchain to create new methods: stronger, more secure, faster and more scalable databases to truly drive innovation.
Sourced by Pavel Bains, CEO, Bluzelle