Richard Verhoeff, ecommerce director of European holiday company CenterParcs, believes he can predict precisely when a customer will book a holiday, the location they will book, and how long they will stay there. Verhoeff is so confident in this belief that he claims he could call any customer and tell them where they will be spending their holiday before they even book it.
CenterParcs anticipates how its customers will behave by using predictive analytics, a data mining technique that combines multiple sources of data to forecast the outcomes of particular actions. Using this technology, CenterParcs has slashed its direct mailings to a quarter of their previous volume, yet has still managed to increase revenues. It also claims an average occupancy rate of around 90% in its holiday homes around Europe – a rate unequalled in the leisure industry.
At its most basic, predictive analysis takes the attributes of a customer or transaction, applies a simple linear equation to them and produces a score. This score could reflect the likelihood of customer attrition, the expected response to an offer, the expected profit, or the lifetime value of a customer.
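As a rough illustration of the scoring step described above, the sketch below multiplies each customer attribute by a weight, adds the results to a constant and so produces a score. The attribute names, weights and intercept are invented for the example and do not come from CenterParcs or any vendor. The logistic function used to turn the score into a likelihood between 0 and 1 is one common choice, assumed here rather than described in the article.

```python
import math

# Hypothetical weights: illustrative only, not taken from any real model.
WEIGHTS = {
    "previous_bookings": 0.8,
    "distance_km": -0.01,
    "months_since_last_stay": -0.05,
}
INTERCEPT = -1.0


def linear_score(attributes):
    """Weighted sum of the customer's attributes plus an intercept."""
    return INTERCEPT + sum(WEIGHTS[name] * attributes[name] for name in WEIGHTS)


def booking_likelihood(attributes):
    """Squash the linear score into the range 0..1 (assumed logistic link)."""
    return 1.0 / (1.0 + math.exp(-linear_score(attributes)))


# A made-up customer record.
customer = {"previous_bookings": 3, "distance_km": 120, "months_since_last_stay": 10}
print(round(linear_score(customer), 2))
print(round(booking_likelihood(customer), 2))
```

In practice the weights would be estimated from historical data rather than written by hand; the point here is only that the score itself is a simple weighted sum.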
To make these predictions more accurate, organisations can combine these simple equations with more sophisticated algorithmic techniques such as neural networks or rule trees, which can capture more complex relationships between different sets of data. Analytics tools allow organisations to run many thousands of these algorithms at any one time. Once the number crunching is complete, the system produces a series of models that can then be tested against a data sample, modified and put into use. As customer behaviour changes, these models can then be adapted as often as required.
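The idea of testing candidate models against a data sample can be sketched in a few lines. The example below is a simplification under stated assumptions: the "models" are just three versions of a hypothetical rule (predict a booking if the customer has booked at least a given number of times before), and the sample is seven made-up records held back from model building. Real tools compare far richer models on far larger samples.

```python
def predict(record, threshold=2):
    """Hypothetical candidate model: predict a booking if the customer
    has booked at least `threshold` times before."""
    return record["previous_bookings"] >= threshold


def accuracy(model, sample):
    """Fraction of records where the prediction matches the actual outcome."""
    hits = sum(1 for record in sample if model(record) == record["booked"])
    return hits / len(sample)


# Made-up held-out sample: records not used when the models were built.
holdout = [
    {"previous_bookings": 0, "booked": False},
    {"previous_bookings": 1, "booked": False},
    {"previous_bookings": 2, "booked": True},
    {"previous_bookings": 3, "booked": True},
    {"previous_bookings": 1, "booked": False},
    {"previous_bookings": 4, "booked": True},
    {"previous_bookings": 5, "booked": False},
]

# Score each candidate on the sample and keep the one that tests best.
candidates = {t: (lambda record, t=t: predict(record, t)) for t in (1, 2, 3)}
results = {t: accuracy(model, holdout) for t, model in candidates.items()}
best = max(results, key=results.get)
print(results, best)
```

Re-running the same comparison on fresher data is one simple way to adapt the chosen model as customer behaviour changes.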
CenterParcs, for example, draws on more than 100 million customer records dating back as far as 1982 and combines these with external data sources such as demographic or geographical information. In an average customer modelling exercise, it uses between 60 and 80 variables – everything from whether the customer has booked with the company before to how far they live from the site – to assess the likelihood of that customer booking a holiday at a certain destination at a certain time of year.
For Verhoeff, the benefit of using predictive modelling techniques is that they can either confirm or disprove concepts that may be little more than a hunch in the minds of the marketing department. “It’s all very well to say that you can predict what will happen tomorrow by looking at what happened yesterday,” he says. “But there are a lot of shades of grey in customer behaviour that predictive analytics takes into account.”
Like CenterParcs, more and more organisations are beginning to see the potential of using predictive analytics techniques to gain an insight into future customer behaviour. Predictive statistical analysis has been used for decades in scientific research, but is now beginning to find its way into more mainstream applications. Household goods giant Unilever, for example, uses predictive algorithms to assess whether certain products will produce a corrosive reaction on the skin, a process that previously required extensive animal testing.
All of the established data mining software vendors, including NCR, Sand Technology, SAS Institute and SPSS, offer this capability as a subset of their data warehousing platforms. A handful of start-ups has also emerged in this sector, including US-based Genalytics, Magnify and Sightward. One of the reasons predictive analytics has gained popularity more recently is a steep drop in the cost of hardware required to process the huge numbers of mathematical equations involved in deep statistical analysis. “Only in the past few years has computing power become cheap enough to do this level of analysis,” explains Doug Newell, founder and CEO of predictive analytics start-up Genalytics. “Only a few years ago, generating the same models would have required a supercomputer. We’ve benchmarked our software on relatively inexpensive clusters of Intel servers.”
Newell says that one customer, a pharmaceutical company, acquired the necessary hardware to run Genalytics’ predictive analytics platform for just $25,000. Historically, hardware investments for data mining purposes have run into the millions.
Economic trends have also forced many organisations to look more closely at how they can generate a return on their investment in customer data. If a mobile network operator can be 2% more accurate in predicting when a customer is likely to defect to a competitor, and react accordingly, this can equate to hundreds of thousands of pounds of additional revenue. “Most of our customers are just trying to keep their heads above water,” says Colin Shearer, vice president of data mining for analytics company SPSS. “Many have spent a fortune aggregating data on their customers and now realise it’s hugely under-utilised.”
So how does predictive analytics technology differ from the standard analytics packages in which many organisations have already invested? According to Guy Creese, an analyst at research company Aberdeen Group, the key difference is in the number of variables and exceptions predictive analysis is able to take into account.
When business managers make assumptions about how customers will react to certain offers, explains Creese, they frequently do not take these exceptions on board, and so create simple rules based on ‘If X bought Y, then X will probably buy Z’. Predictive analysis considers so many more scenarios that the likelihood of the prediction proving correct is greatly increased. Traditional business intelligence packages, such as those that plug into enterprise applications from Oracle or SAP, simply identify patterns in behaviour and do not interpret them.
According to Judy Bayer, vice president of analytics at NCR, the greater the number of potential variables an organisation can feed into the system, the greater the potential for accuracy. This is why predictive analysis works best in industries that amass huge amounts of data on their customers, such as telecommunications, retail or banking. Capital One, for example, has the equivalent of 85 terabytes of data residing in its data warehouse. “We take data from whatever sources are available. We load it into our systems, analyse it, test our analysis, modify it and test it again,” explains European CIO Catherine Doran. “That’s the only way we can learn going forward.”
UK supermarket chain Co-operative Retail has almost two billion rows of customer data sitting in a six-node data warehouse cluster. According to Martin Willcox, Co-op’s data architecture manager, the company collects as much data as it can on its customers, from real-time point-of-sale reports and basket analysis to loyalty card data and external factors such as age and demographics. Because customer behaviour is so susceptible to change, Co-op reviews its models monthly using the latest available data and repositions its in-store marketing and loyalty card offers accordingly.
One of the cleverest aspects of predictive analytics, claim vendors, is that it can learn from its successes and mistakes. Feeding the results of a campaign back into the system gives it more variables to work with and heightens its potential for accuracy.
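A minimal sketch of such a feedback loop is shown below. It is a deliberately simple stand-in rather than any vendor's method: the hypothetical `ResponseModel` class keeps a running response rate for each customer segment and revises it whenever the results of a finished campaign are fed back in. The segment names, the campaign results and the starting estimate of 0.5 for unseen segments are all assumptions made for the example.

```python
from collections import defaultdict


class ResponseModel:
    """Hypothetical self-updating model: estimated response rate per segment."""

    def __init__(self):
        self.sent = defaultdict(int)       # offers sent per segment
        self.responded = defaultdict(int)  # responses received per segment

    def predict(self, segment):
        """Estimated response rate; 0.5 if nothing has been observed yet
        (an arbitrary starting assumption)."""
        if self.sent[segment] == 0:
            return 0.5
        return self.responded[segment] / self.sent[segment]

    def feed_back(self, campaign_results):
        """Fold (segment, responded) pairs from a finished campaign into the counts."""
        for segment, responded in campaign_results:
            self.sent[segment] += 1
            if responded:
                self.responded[segment] += 1


model = ResponseModel()
# Made-up results from one campaign.
model.feed_back([
    ("families", True),
    ("families", False),
    ("families", True),
    ("couples", False),
])
print(model.predict("families"), model.predict("couples"), model.predict("retirees"))
```

Each further campaign passed to `feed_back` shifts the estimates, which is the sense in which the model learns from both its successes and its mistakes.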
One area in which this ‘self-learning’ capability has proven particularly successful is in fraud analysis. Israel-based telecommunications software provider ECTel prevented two cases of potential fraud in a month for one of its customers using a predictive analytics tool from SPSS. The system was able to use the historical information ECTel had amassed on previous cases to suggest that the same type of fraud was likely to happen again. As new frauds occur, these are fed into the system and used in future analyses.
Ultimately, however, knowing what is likely to happen in the future is useless unless an organisation is able to act on that information. For Verhoeff at CenterParcs, the main challenge in using predictive analytics lies not in generating reliable predictions, but in translating them into plain English for the staff who act on them.
“Lots of companies forget that it’s not about the tool. It’s about context,” he says. “My employees need to know why they’re suggesting a certain product to a particular customer. They’re not doing it because a piece of technology told them so.”