At Big Data LDN, Greg Hanson, VP EMEA at Informatica, discussed with IA the benefits of an end-to-end data engineering approach to achieving clean data that you can trust for successful AI initiatives. This approach includes data discovery, ingestion, integration, quality, prep, and governance.
It’s hard to meet a business leader who isn’t excited by the potential of AI and predictive analytics in helping drive their organisation’s efficiency. Fraud detection, next-best-action, operational efficiency and forecast analysis are among the many business challenges that AI and analytics can help solve. However, bad data is currently hindering AI since machine learning (ML) models are only as good as the data you feed them.
“Looking back at the early days of big data, it’s clear a lot of people lacked a maturity around data quality,” said Hanson. “Many people working on AI initiatives wrongly assume that their AI engine will be able to remove inconsistencies and inaccuracies in their data automatically. This is not the case; if you put garbage in, you’ll get garbage out.”
Improving data quality
To enhance the quality of data, Hanson argued that enterprises should build a catalogue of assets. This drives informed decisions — once you’ve got a centralised catalogue of assets, you can start to do things like bringing data together and making it searchable.
He said: “This is going to enable organisations to train their AI and ML algorithms with a more complete, more comprehensive and less biased sets of data.”
According to Hanson, this can be done by using good data engineering tools with AI built in.
“What we actually need is not just artificial intelligence in the analytics layer — in terms of generating graphical views of data and making decisions in real-time around data — we need to make sure that we’ve got artificial intelligence in the backend to ensure we’ve got well-curated data going into our analytics engines.”
He warned that if organisations fail to do this, they won’t see the benefit of analytical AI going forward.
“In my opinion, a lot of mistakes could be made, some serious mistakes, if we don’t make sure that we train our analytical AI with high quality, well-curated data,” said Hanson.
He added that if the data sets aren’t good, AI advocates in organisations won’t get the results they expect, which could hinder future investment in the technology.
However, Hanson is an optimist. He argued that, thanks to the cloud and next-gen analytics, it has become easier to make data available to business analysts and data scientists in an AI-supported, self-service manner.
“AI tools can now help to make sure that that propagation of data and the lifecycle of that data is tracked, traced and made available to organisations to help manage,” said Hanson.
Simply put, AI has the potential to pull insights from data sets and help discover commercial opportunities, as well as to assess data readiness and technical feasibility.