DIY or buy? Weighing up infrastructure options for big data

Building big data capabilities from scratch seems tempting to many companies. They see it as a way to tailor the technology to their specific business needs and to give their IT department complete oversight and control of the processes and capabilities.

There is some truth in this, but the approach is fraught with challenges that could mean business goals take a long time to deliver, are only partially met, or aren’t achieved at all.

It’s the age-old dilemma for IT departments: how can you do more with less? The question is particularly pressing for big data, which for many companies is a new set of capabilities.

When tackling big data for the first time, most IT departments will have little experience or expertise to tap into. This makes the task of building capabilities with technology from a range of vendors all the more challenging.

And it’s not just about building the infrastructure – it’s also about evaluating, testing, developing, integrating and tuning it. This all takes time and, with only a small knowledge base to draw on, will take even longer.

The issue of expertise also applies in the longer term. It’s likely that the people who worked on building the big data capability will no longer be in the team after a few years. This means the expertise that has been built up will be gone by the time organisations want to evolve what they’re doing with big data.

There may also be a basic lack of resources to devote to the task, meaning the project takes longer to complete, delaying the business benefits that big data is meant to deliver – also known as ‘time to value’.

And if there are problems with the implementation of the new technology, this could lead to further complications and delays that will take many hours and further investment to correct.

For example, poor network performance can often take months to resolve under the DIY approach, especially once the multiple vendors involved are taken into account. With the pre-built approach, a single vendor takes responsibility for tracking down and fixing the issue, which is usually much quicker.

For many businesses, the ‘buy’ approach will be the better option. By investing in a suite of big data technologies packaged together, deployed either on-premise or via the cloud, companies can address all of these time-to-value challenges.

The technology is already tested, integrated and optimised for the task, and the vendor provides the expertise and support that may be lacking in the IT department, keeping that expertise current as the technology evolves.

And while even well-thought-out DIY implementations take months to make production-ready, some big data appliances can, in theory, be up and running on-premise in a matter of hours.

Counting the pennies

Some organisations may see the DIY approach as a way to save money, as they can seek the best deal for each component of the stack they are building. They may also believe that buying an engineered system means paying a premium for the work a vendor puts into packaging the technology.

But, according to research by the Enterprise Strategy Group (ESG), commissioned by Oracle, taking the pre-built approach when ramping up your big data capabilities is likely to save you money, and a significant amount at that.

For a medium-sized Hadoop-oriented big data project, ESG found that a pre-built system could be around 45% cheaper than the DIY equivalent.
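
To make that figure concrete, the short Python sketch below applies a 45% saving to a DIY budget. The budget of 1,000,000 and the function name `prebuilt_cost` are purely illustrative assumptions rather than figures from the ESG research; only the 45% comes from the study as reported here.

```python
def prebuilt_cost(diy_cost, saving_percent=45):
    """Estimate a pre-built system's cost from a DIY cost and a percentage saving.

    The default of 45 reflects the 'around 45% cheaper' figure quoted in the
    article; everything else here is a hypothetical illustration.
    """
    if not 0 <= saving_percent <= 100:
        raise ValueError("saving_percent must be between 0 and 100")
    return diy_cost * (100 - saving_percent) / 100


# Hypothetical DIY budget, chosen only for illustration.
diy_budget = 1_000_000
estimate = prebuilt_cost(diy_budget)

print(f"DIY: {diy_budget:,}")
print(f"Pre-built estimate: {estimate:,.0f}")
print(f"Estimated saving: {diy_budget - estimate:,.0f}")
```

Because ESG’s figure is approximate and applies to a medium-sized Hadoop-oriented project, a calculation like this is best treated as a starting point for a budget conversation, not a forecast.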

By taking the ‘buy’ approach, Belgian media group De Persgroep was able to deploy its big data project in a mere three months. Its big data appliance also proved to be more cost-effective than an internally built Apache Hadoop cluster, which would have required multiple servers and software licences, as well as greater maintenance resources.

De Persgroep analysed customer behaviour, such as website interactions and payment behaviour, so that it was able to predict subscription churn for its newspaper business with an accuracy of 92%.

Future proofing

Such is the speed of development of open-source big data technologies that, to stay at the cutting edge, organisations will continually need to re-evaluate and integrate new open-source projects whilst still delivering enterprise-grade platforms and services.

For example, there is currently a move towards the Apache Spark cluster computing framework. For Hadoop users, this shift means significant migration and integration work to ensure the most relevant technology is being used.

Pre-built systems avoid many of the issues that organisations face when building their own big data capabilities – while also bringing a host of additional benefits.

Building big data systems is not what gives value to businesses – the value comes from the use of analytics. By choosing the pre-built route, businesses can slash the time to value, save money and future-proof their capabilities.

Sourced from Vicky Falconer, big data solutions lead, Oracle Australia

Ben Rossi

