Data is the oil that greases the cogs of the modern machine. But there’s a problem: organisations are struggling to gain business insights from this new power.
The Economist magazine famously described data as the new oil. It certainly has the potential to grease the wheels of the digital economy, but with that come both opportunities and threats. Some go further: data, they say, is the new asbestos.
In short supply
In the market, many enterprise customers are trying to build very big data science teams. Some are trying to hire hundreds to deal with the explosion of data, with sources ranging from customer input to IoT devices, the latter of which will become the main channel.
But it’s not easy: there’s a huge shortage of data scientists.
There are also what Gartner calls citizen data scientists: people who create or generate models that use advanced diagnostic analytics or predictive and prescriptive capabilities, but whose primary job function is outside the field of statistics and analytics. They complement expert data scientists rather than replace them, as they do not have the specific, advanced data science expertise to do so.
Even with this, many enterprises are really struggling to establish a citizen data science team, let alone a data scientist team.
Data science is described as a multi-disciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from data in various forms, both structured and unstructured, similar to data mining.
The real pain enterprises face is on the data side: building the data sets so that they are ripe for data science to be applied. Data is very complex, and when it is collected in an enterprise it is not stored for machine learning and data science purposes. It is stored for business purposes, in charts, for example.
Businesses have to transform this business data into a machine learning format, a step that is called “feature learning,” says Ryohei Fujimaki, founder and CEO of dotData. “And basically we have to apply a lot of domain knowledge to run the data.”
“The data part usually takes up 80% of the time in a data science project, and machine learning 20%” — Fujimaki
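As a rough illustration of what turning business records into a machine learning format can involve, the sketch below aggregates raw transaction rows into one feature row per customer. It is not dotData's method: the record layout, the `build_features` helper and the chosen features are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transaction records, roughly as a business system might store them.
transactions = [
    {"customer_id": "c1", "date": date(2019, 1, 5), "amount": 120.0},
    {"customer_id": "c1", "date": date(2019, 2, 9), "amount": 80.0},
    {"customer_id": "c2", "date": date(2019, 1, 20), "amount": 300.0},
]


def build_features(rows, as_of):
    """Aggregate raw rows into one feature row per customer (illustrative only)."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["customer_id"]].append(row)

    features = {}
    for customer_id, items in grouped.items():
        amounts = [item["amount"] for item in items]
        last_purchase = max(item["date"] for item in items)
        features[customer_id] = {
            "n_transactions": len(items),
            "total_spend": sum(amounts),
            "mean_spend": sum(amounts) / len(amounts),
            # Recency relative to a reference date chosen by the analyst.
            "days_since_last": (as_of - last_purchase).days,
        }
    return features


feature_table = build_features(transactions, as_of=date(2019, 3, 1))
```

Even in this toy case, deciding which aggregates matter (recency, frequency, spend) is where the domain knowledge comes in, and that is the part the article describes as the most time-consuming.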
So, in this climate, where talent is in short supply but the data keeps flowing, it’s necessary to automate the end-to-end process of data science, including the data and feature pipeline.
Gaining insights and driving actions
Machine learning can forecast, predict and identify new customers and, in financial services for example, who carries the most risk. This prediction drives business process automation: it is integrated with the business system and triggers some business action automatically. In this way, there are many areas in which a business can be made much more efficient.
Another very important outcome from the machine learning and data science process is business insights. Data is very complex, and although industry experts have domain knowledge and intuition, there is a lot of hidden knowledge behind the huge amount of data entering the enterprise. Machine learning or the data science process can usually uncover something unknown, unseen or unexpected, even for an expert.
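A minimal sketch of how a prediction might trigger a business action automatically is shown below. The `Prediction` type, the action names and the 0.8 threshold are hypothetical; a real integration would depend on the model and the business system in use.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    customer_id: str
    risk_score: float  # assumed to be a probability in [0, 1] produced by some model


def route_predictions(predictions, threshold=0.8):
    """Map each prediction to the business action it would trigger (illustrative)."""
    actions = {}
    for p in predictions:
        if p.risk_score >= threshold:
            # High predicted risk: hand the case to a person.
            actions[p.customer_id] = "flag_for_manual_review"
        else:
            # Low predicted risk: let the business system proceed on its own.
            actions[p.customer_id] = "auto_approve"
    return actions


actions = route_predictions([Prediction("a", 0.9), Prediction("b", 0.2)])
```

The threshold is a business decision as much as a technical one, which is one reason such integrations need people who understand both the model and the process it drives.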
Example from dotData
dotData worked with a banking customer that applied its platform to predict which new customers would be interested in a mortgage loan type product. The bank first thought that this product would appeal to younger people. But what it found was that a very different type of customer was interested in it: people who were a bit more senior in age. It turned out that this demographic was purchasing the product more than the predicted younger demographic.
This type of new business insight meant the customer could build and design a new promotional campaign for this customer segment, or design a new product based on this type of business insight.
Automating the data science and machine learning process produced new business insights from the data.
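The kind of check that can surface such a segment is easy to sketch. The example below compares purchase rates across age bands; the records are invented toy values, not the bank's data, and the band boundaries and function names are assumptions made for illustration.

```python
from collections import Counter

# Invented toy records: (age, bought_product). These are NOT data from the bank.
records = [
    (24, False), (29, False), (31, True),
    (47, True), (52, True),
    (58, False), (63, True),
]


def age_band(age):
    """Assign an age to a coarse band (boundaries are arbitrary for this sketch)."""
    if age < 35:
        return "under_35"
    if age < 55:
        return "35_to_54"
    return "55_plus"


def purchase_rate_by_band(rows):
    """Return the share of customers in each age band who bought the product."""
    totals, buyers = Counter(), Counter()
    for age, bought in rows:
        band = age_band(age)
        totals[band] += 1
        buyers[band] += int(bought)
    return {band: buyers[band] / totals[band] for band in totals}


rates = purchase_rate_by_band(records)
```

An automated platform would explore many such groupings across many tables; the point here is only that the output is a comparison a domain expert can read and act on.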
Data scientists alone… are not good enough
What type of skill-sets do businesses need to allow data science to extract meaningful business outcomes? The first thing is mathematical or statistical knowledge, but at the same time these businesses have to work with very large-scale, complex data, and they need data engineering for this.
“Also, using the same data in solving different business problems needs different domain expertise,” says Fujimaki.
Data scientists can’t do it all
A good data scientist needs to have a strong mathematical and statistical skill-set, but often they do not possess business and data engineering skills.
The shortage of data scientists is a hurdle for any successful data science project. But the problem is that data scientists alone are not good enough to complete a big, complex project.
Successful data science projects will need domain experts, data engineers and data scientists.
A very big part of a data science project is prediction, which needs to be integrated with the business system and automatically drive a lot of digital maintenance. This means that businesses need an engineer who understands the data science process and integrates it appropriately into business systems. Fujimaki calls these types of people “data science talents”.
A data scientist is integral, but there are a lot more roles required to complete a data science project.
Solutions such as dotData help solve this problem, sharing the effort and bridging the gaps by automating data science and machine learning.