DataOps: getting data right for DevOps

DataOps is an emerging methodology that brings DevOps teams together with data scientists and data engineers to support development.

It’s sad but true: most attempts by companies to leverage data as a strategic asset fail. Managing vast amounts of disparate data and then distributing it to those who can use it to drive value is proving incredibly difficult. The bad news: data continues to grow at a rapid rate, and modern application development is getting more and more complex. As such, many organisations are recognising the need for a new paradigm. Enter DataOps.

Although the term DataOps is fairly new, leading research firms, including Gartner and 451 Research, have already named it a top trend in enterprise data for its potential to enable data transformation and help companies make use of their data.

What is DataOps?

If you’re the kind of person who’s quick off the mark, you’ll probably have noticed that DataOps sounds a little bit like DevOps; of course, this is no coincidence. The term DataOps was inspired by DevOps, which combines software development and IT operations to build and deploy software products faster. DataOps, by contrast, is the combination of DevOps teams and data engineers (get it?).

If you still don’t get it, Matt Aslett, research vice president, data, AI and analytics, 451 Research, defined DataOps a little bit better. He called it “the alignment of people, process, and technology to enable more agile and automated approaches to enterprise data management in service of business goals.”

He added: “It aims to provide easier access to enterprise data to meet the demands of various stakeholders who are part of the data supply chain (developers, data scientists, business analysts, DevOps professionals, etc.) in support of a broad range of use cases.”

If you’re still unsure, it’s probably because a single, definitive definition of DataOps is hard to come by. This is because DataOps is something of an umbrella term denoting the development and roll-out of data analytics pipelines.

In simple and broad terms, DataOps is all about encouraging organisations to tackle data initiatives, such as testing data pipelines, in a similar way to how they test software and application development.
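To make the idea of testing data pipelines like software more concrete, here is a minimal Python sketch. It is an illustration only, not something described by 451 Research or Delphix: the `Order` record, the `clean_orders` step and its data-quality rules are all hypothetical, chosen to show a pipeline step being checked with an ordinary unit test.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    order_id: int
    country: str
    amount: float


def clean_orders(rows):
    """Hypothetical pipeline step: normalise country codes, drop invalid rows."""
    cleaned = []
    for row in rows:
        # Assumed data-quality rule: an amount must be present and non-negative.
        if row["amount"] is None or row["amount"] < 0:
            continue
        cleaned.append(
            Order(
                order_id=int(row["order_id"]),
                country=row["country"].strip().upper(),
                amount=float(row["amount"]),
            )
        )
    return cleaned


def test_clean_orders():
    """Test the pipeline step the same way application code is tested."""
    raw = [
        {"order_id": "1", "country": " gb ", "amount": 10.0},
        {"order_id": "2", "country": "us", "amount": -5.0},  # negative amount
        {"order_id": "3", "country": "de", "amount": None},  # missing amount
    ]
    assert clean_orders(raw) == [Order(1, "GB", 10.0)]


if __name__ == "__main__":
    test_clean_orders()
    print("pipeline step passed its checks")
```

A check like this can run in the same CI job as the application's own tests, so a change that breaks the data step is caught before it reaches a shared environment.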

Understanding the DataOps landscape

In March 2019, 451 Research released a report, commissioned by Delphix, entitled ‘DataOps Lays the Foundations for Agility, Security and Transformational Change’, which revealed how enterprises around the globe are planning to invest in DataOps over the following 12 months.

According to the research, 86% of organisations surveyed were planning to increase their investments in DataOps, while 92% were confident their strategy would have a positive impact on their organisation’s success.

Compliance and security were notable call-outs in the report: nearly three-quarters of respondents cited security and compliance as a top perceived benefit of DataOps. They expect new DataOps technologies to help reduce the growing friction between the need to innovate faster and the need to comply with GDPR, CCPA and other data privacy regulations.

Commenting on the report, Aslett said: “It has become clear that if enterprises are to realise business value from the development and delivery of data-driven applications and data-driven decision making, then more agile and automated approaches to database provisioning and data management are required, approaches that are more responsive to changing business requirements.”

How DataOps supports DevOps

DevOps teams are pretty good at managing technology, process and organisational bottlenecks when it comes to application delivery and infrastructure provisioning. They can provision environments on demand and deploy applications or microservices to them via self-service.

What DevOps struggles with is getting the right data to the right environment at the right time. Speaking with Information Age, Sanjeev Sharma, VP of Data Transformation at Delphix, expanded on this issue.

He said: “If you go to any DevOps conference, very rarely do you hear people talk about data protection; it has been done by a separate team. However, DevOps realise now that once they address the infrastructure provisioning problem and the application deployment problem, the next bottleneck becomes data.

“Most organisations start their Continuous Integration and Continuous Delivery (CI/CD) pipeline journeys of achieving ‘flow’ in their delivery pipeline by addressing deployment automation of application code, and provisioning automation of environments, while data provisioning as a part of the environment is typically the least automated set of steps.”

According to Sharma, this is a problem because DevOps “cannot achieve true CI/CD ‘flow’ without addressing data provisioning – getting the right data to the right environment and accessible by the right practitioners, at the right time, in a secure and compliant manner.”

This is why DataOps is so useful. With DataOps, DevOps teams can get the right data at the right time, ensuring that the application they’re delivering is developed with the right production data and tested using the right data sets.

Sharma added: “With DataOps, DevOps can manipulate data more too. For example, they can create multiple copies of data at various timestamps. And they can test multiple applications which may have multiple databases behind them.”
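As a rough illustration of what “multiple copies of data at various timestamps” could look like in practice, the Python sketch below keeps point-in-time snapshots of a dataset and hands out an independent copy as of a requested time. It is a toy, in-memory model under stated assumptions: `SnapshotStore`, `capture` and `provision` are hypothetical names, and this is not how Delphix or any specific DataOps product implements the feature.

```python
import copy
from datetime import datetime, timezone


class SnapshotStore:
    """Hypothetical in-memory store of point-in-time copies of a dataset."""

    def __init__(self):
        self._snapshots = []  # (timestamp, data) pairs in capture order

    def capture(self, data, taken_at):
        # Deep-copy so later changes to the source never leak into the snapshot.
        self._snapshots.append((taken_at, copy.deepcopy(data)))

    def provision(self, as_of):
        """Return a fresh copy of the latest snapshot taken at or before as_of."""
        candidates = [(ts, d) for ts, d in self._snapshots if ts <= as_of]
        if not candidates:
            raise LookupError("no snapshot at or before the requested time")
        _, data = max(candidates, key=lambda item: item[0])
        # Each environment gets its own copy, so tests cannot affect each other.
        return copy.deepcopy(data)


store = SnapshotStore()
customers = [{"id": 1, "plan": "free"}]

store.capture(customers, datetime(2019, 3, 1, tzinfo=timezone.utc))
customers[0]["plan"] = "paid"  # the source data changes after the first snapshot
store.capture(customers, datetime(2019, 3, 2, tzinfo=timezone.utc))

# Two environments, each provisioned with data as it stood at a different time.
test_env = store.provision(datetime(2019, 3, 1, 12, tzinfo=timezone.utc))
staging_env = store.provision(datetime(2019, 3, 3, tzinfo=timezone.utc))
```

Here `test_env` sees the customer on the free plan and `staging_env` sees the paid plan, and changes made in either environment leave the stored snapshots untouched.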

Andrew Ross

As a reporter with Information Age, Andrew Ross writes articles for technology leaders, helping them manage business-critical issues both for today and in the future.