Einstein once said ‘Insanity is defined as doing something over and over again and expecting different results’. Now, while there’s some debate about whether Einstein did or did not give us this well worn aphorism, there’s little doubt that whoever did come up with it was NOT talking about big data! And yet, it’s an entirely appropriate observation to make about the state of the big data industry right now.
Whatever the nature of a business’s analytics environment, in most cases it will be expected to live up to the old adage of delivering more with less: more insights, more efficiency and more simplicity, all while reducing the cost of investment. And to exacerbate matters, the volume of data that needs to be managed is exploding. According to Gartner and IDC, data volumes currently stand at around 2 trillion GB, doubling each year and projected to reach 40 trillion GB by 2020.
Now, if your analytics is based on SAS, you will probably have to significantly increase your investment each year to perform the same analytics on double the volume of data.
It’s expensive and time-consuming to manage all that duplicate storage, data-centre space and network infrastructure for moving data, as well as the extra processing power required. Even with all of that, the analytics may only be performed on subsets of the data, and the time-to-insight will not live up to your business requirements.
Traditional approaches to analytics like SAS force you to move data from where it is stored to separate analytics servers: the data lives in one place, the models run in another, and the results must then be fed back into the database.
This results in huge pain points, including:
• Expensive hardware
• Slow insight delivery as two thirds of the time is often spent moving the data
• Sub-par model analysis – due to memory constraints on the analytics servers, models must be built with only the data that fits into memory, not the entire dataset
• Outdated analysis – in several industry verticals, the underlying database can change rapidly compared with the snapshot moved into memory on the analytics servers
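The extract-model-writeback round trip behind these pain points can be sketched in miniature. This is a hedged illustration only: SQLite stands in for the warehouse, plain Python for the analytics server, and the `claims` table, its columns and the toy scoring rule are all hypothetical.

```python
import sqlite3

# Hypothetical stand-in: SQLite plays the database, plain Python the analytics server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE claims (id INTEGER, amount REAL)")
conn.executemany("INSERT INTO claims VALUES (?, ?)",
                 [(1, 120.0), (2, 340.0), (3, 95.0)])

# Step 1: move the data OUT of the database to the "analytics server" (here, a list).
rows = conn.execute("SELECT id, amount FROM claims").fetchall()

# Step 2: run the model where the data now lives. A toy score: flag high amounts.
scores = [(1.0 if amount > 200.0 else 0.0, id_) for id_, amount in rows]

# Step 3: feed the results back into the database.
conn.execute("CREATE TABLE claim_scores (id INTEGER, score REAL)")
conn.executemany("INSERT INTO claim_scores (score, id) VALUES (?, ?)", scores)

# Every row crosses the wire twice -- once out, once back in.
print(conn.execute("SELECT id, score FROM claim_scores ORDER BY id").fetchall())
```

At warehouse scale, steps 1 and 3 are where the duplicate storage, network traffic and stale snapshots come from.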
And there’s another issue: data quality. The better the data quality, the more confidence users will have in the outputs they produce, lowering the risk in the outcomes and increasing efficiency. The old ‘garbage in, garbage out’ adage is true, as is its inverse: when outputs are reliable, guesswork and risk in decision making can be mitigated.
But here’s the rub: rather than stepping back and asking how to break this cycle, many organisations are like the proverbial hamster on the wheel, just trying to run faster and faster. And, like the hamster, not getting anywhere. Fast.
Rather than simply doing the thing they have always done and hoping for a different outcome (because, after all, we do know that is the definition of madness), they need to stop and take a different tack. They need to start with the objective of achieving scalability at speed for their analytics at a lower cost and consequently, look for a different approach.
Fuzzy Logix turned the problem on its head with its in-database analytics approach – moving the analytics to the data, as opposed to moving the data to the analytics, and eliminating the need for separate analytics servers.
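The contrast with the extract-and-model pattern can be sketched with the same toy example; this is not Fuzzy Logix’s actual technology, just an illustration of the principle using SQLite as a stand-in warehouse, with a hypothetical `claims` table and scoring rule expressed as SQL so the computation runs where the data lives.

```python
import sqlite3

# Stand-in warehouse with hypothetical data (SQLite in place of a real database).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE claims (id INTEGER, amount REAL)")
conn.executemany("INSERT INTO claims VALUES (?, ?)",
                 [(1, 120.0), (2, 340.0), (3, 95.0)])

# In-database principle: the model is pushed to the data as SQL.
# One statement, no extraction step, no separate analytics server.
conn.execute("""
    CREATE TABLE claim_scores AS
    SELECT id,
           CASE WHEN amount > 200.0 THEN 1.0 ELSE 0.0 END AS score
    FROM claims
""")
print(conn.execute("SELECT id, score FROM claim_scores ORDER BY id").fetchall())
```

Because no rows leave the database, the approach scales with the database engine rather than with the memory of an external server.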
A real-world example: for a large US health insurer, moving the data out of the database to the SAS servers meant breaking it into 25 jobs and assembling the results – a process that took over six weeks. Using in-database analytics allowed the customer to work on the entire dataset at once and finish the analytics in less than 10 minutes.
So, if you want to significantly accelerate your data analytics capabilities and see your business achieve phenomenal performance jumps, with potential huge cost and resource savings, take heed of the message from old Albert. And stop doing whatever you’ve always done. Take a different approach. You never know, you may just return a bit of sanity to your big data strategy!
Sourced from Fuzzy Logix