Active intelligence



Every single transaction. Every key the check-out worker punches. Every refund processed. Every opening of the till drawer and the time it remains open.

One US retailer is capturing all that data at every retail point, at every one of its supermarket branches, every retail day of the year, for all 15,000 of its till operators.

The purpose of sucking up such vast amounts of microscopic detail? The US retailer (which asked not to be identified for fear of alienating check-out staff) is searching for employee fraud by analysing behaviour, comparing the cash register log to punched key data, and spotting anomalies that might indicate missing cash or goods.

The ‘shrinkage’ the supermarket chain is trying to eliminate may only account for around 1% of revenues, but in a sector where low-single digit margins are the norm, that can be the difference between profit and loss.
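The retailer's actual method is not disclosed, but the kind of anomaly-spotting described above can be sketched in a few lines. The example below is a minimal, hypothetical illustration – the field names, the counts, and the two-standard-deviation threshold are all assumptions, not the retailer's real rules – flagging till operators whose rate of 'no sale' drawer openings is far above that of their peers.

```python
from statistics import mean, stdev

# Hypothetical per-operator counts of 'no sale' drawer openings in a
# shift; names, figures and threshold are illustrative assumptions.
no_sale_counts = {
    "op_101": 2, "op_102": 3, "op_103": 1, "op_104": 2,
    "op_105": 14,  # unusually many drawer openings without a sale
    "op_106": 3, "op_107": 2,
}

def flag_outliers(counts, sigmas=2.0):
    """Flag operators whose count sits more than `sigmas` standard
    deviations above the peer average."""
    values = list(counts.values())
    mu, sd = mean(values), stdev(values)
    return [op for op, n in counts.items() if n > mu + sigmas * sd]

suspects = flag_outliers(no_sale_counts)  # ["op_105"]
```

A real system would of course weigh many more signals – refund patterns, drawer-open durations, void frequencies – but the principle is the same: compare each operator's behaviour against the statistical norm and investigate the outliers.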

Data warehousing on such a scale – and at such a cost – may seem to be overkill in an era of IT budget constraints, but it is a symptom of the pressure organisations are under to support faster, more widely spread and more accurate decision making – factors that become more rather than less important in tougher economic times.

Volume switch

The sheer volume of data available for analysis is staggering – and, if not corralled and made accessible, is overwhelming. A recent poll by market research company BuzzBack of 158 senior executives at major companies (those with revenues of more than $500 million) found that 59% perceived that the amount of data available to them for decision-making was doubling or tripling each year. Some described the sensation as “drowning” or “swimming” in data; others said they felt “frozen”, finding it hard to act because of either conflicting data or data that did not reach them in time to help with key decisions.


In practice: Dell

The build-to-order and direct sales model at PC and server maker Dell has fostered a supply chain like no other. Orchestrating a flow of components from over 270 suppliers, the company holds a mere three days’ worth of inventory – none of which is in finished goods.

As orders come in, information on them is fed into an enterprise data warehouse. That drives analytics to adjust the component shipments that are needed from suppliers, to configure the assembly lines to address order flow, and so on.

“Huge value is placed on making instant supply chain decisions based on the analysis of vast amounts of consolidated data,” says CIO Randy Mott. “The warehouse drives all aspects of the business, not just decision support but tying it into the real-time decisions on the supply chain. The business must provide ‘any question, any time, anywhere and the correct answer’.”

The correct answer is addressed by keeping the warehouse data as current as possible. Dell, which is running about 100TB of raw data and about 20TB of ‘usable data’ in its warehouse, is moving to refresh the warehouse hourly, from a previous rate of three or four times a day.

“If you think of it in terms of someone asking simple questions, then ensuring that the data is no more than minutes old looks artificial. But if it is someone making decisions that can drive the business then there is real ROI [in having that decision made on data that is only minutes old],” says Mott.

“Working with an active data warehouse can drive huge efficiency into the supply chain. Knowing that some product is selling out gives you a chance to get some more on the truck and out before the door closes,” says Mott.

“Businesses are undergoing a fundamental shift in the way they make decisions. In today’s environment, decision making occurs more frequently and at all levels of an organisation. It is no longer a semi-regular senior management activity,” say analysts at market watcher IDC.

“The trend towards the democratisation of information and broader decision-making responsibilities demands timely delivery of relevant information to each decision maker,” they add.

That is backed up by the results of the BuzzBack survey. It showed that 73% of respondents felt they were making more daily decisions than a year ago and 53% had less time to make those decisions.

The upshot: missed opportunities for the business – a feeling that was expressed by 49% of respondents.

But companies are not just looking to tease better insight from the vast amounts of detailed data they are warehousing; they are also making the resource available to a wider group of decision makers and ensuring that the data is as up-to-date as possible.

Active source

That need for fast and widely distributed decision making is not always well served by the traditional ‘snapshot’ approach to data warehousing. Currently, the dominant method of replenishing data warehouses and data marts is to use extraction, transformation and load tools to pull data from source systems periodically – at the end of the day, week or month.

Moving forward from that, so-called ‘active data warehouses’ draw live data from transaction systems using a more continuous approach such as a message bus, and therefore refresh the warehouse on a much more frequent basis.
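The contrast between the two feeding styles can be sketched simply. The example below is an illustrative toy, not any vendor’s actual API: a list stands in for a warehouse table, and an in-process queue stands in for the message bus between the transaction systems and the warehouse.

```python
from queue import Queue, Empty

# Stand-ins for real infrastructure: a list for a warehouse table and
# a Queue for the message bus. Purely illustrative.
warehouse = []
bus = Queue()

def batch_etl(source_rows):
    """Traditional snapshot: extract everything from the source system
    once per period (end of day, week or month) and load it in bulk."""
    warehouse.extend(source_rows)

def active_feed(max_batch=100):
    """Active style: drain whatever transactions have arrived on the
    bus since the last pass, so the warehouse lags the source systems
    by minutes rather than a day. Returns the number of rows loaded."""
    loaded = 0
    while loaded < max_batch:
        try:
            warehouse.append(bus.get_nowait())
        except Empty:
            break
        loaded += 1
    return loaded
```

In the batch style, `batch_etl` runs once per period on the whole extract; in the active style, transactions are published to the bus as they occur and `active_feed` runs continuously or every few minutes, which is what lets the refresh interval shrink from daily to near-real-time.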

The aim is pretty transparent. With more up-to-date data, employees such as shipping clerks, customer service staff or call centre agents can run queries on customer, order or schedule information that is only minutes old. An airline gate attendant, for example, would be able to decide which passenger gets a seat on an overbooked flight by running a quick check on which has the best frequent flier profile. Or a call centre agent can try to push through a sale based on customer information gleaned moments before when the customer visited the company web site.
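The gate attendant’s query in that example is a simple ranking over up-to-the-minute warehouse data. The sketch below is hypothetical – the tier names, ranking order and passenger records are assumptions for illustration – but it captures the shape of the decision.

```python
# Hypothetical loyalty tiers; the ranking order is an assumption.
TIER_RANK = {"basic": 0, "silver": 1, "gold": 2, "platinum": 3}

# Illustrative standby list, as it might be returned by a warehouse
# query against current frequent-flier records.
standby = [
    {"name": "A. Jones", "tier": "silver", "miles_ytd": 12_000},
    {"name": "B. Smith", "tier": "platinum", "miles_ytd": 48_000},
    {"name": "C. Brown", "tier": "gold", "miles_ytd": 30_000},
]

def best_profile(passengers):
    """Pick the passenger to seat first on an overbooked flight:
    highest loyalty tier first, then most miles flown this year."""
    return max(passengers,
               key=lambda p: (TIER_RANK[p["tier"]], p["miles_ytd"]))
```

The value of the active warehouse here is not the query itself – it is trivial – but that the records it runs against are minutes old rather than a day old.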

Two of the pioneers of data warehousing – retail giant Wal-Mart and systems vendor Dell Computer – are turning their vast stores of data into ‘live’ decision-making engines.

Wal-Mart – which already refreshes its 300-terabyte (TB) warehouse with new and updated records every 10 minutes – is currently discussing with its warehouse technology supplier NCR Teradata how to cut that cycle down to two minutes.

As always with Wal-Mart, the cost justification comes from the refinement of the company’s supply chain and the fine-tuning of profitability analysis from regional level right down to aisle and product level. “They want to take further cost out of the supply chain by getting the right goods to the right people at the right time,” says NCR CEO Mark Hurd.


In practice: HBOS

Sometimes a project is tested to the limits not in the years after its completion but halfway through its implementation.

In April 2001, Halifax Card Services initiated a project to move its core credit card application from one transaction data provider (First Data) to another (EDS). To provide analytic capabilities on top of the new operational platform, the company decided to build its own data warehouse. After proving the concept with a data mart that used the Oracle database and analytical tools from Business Objects, it then embarked on building a data warehouse with Informatica’s PowerMart designed to draw on 300 million records from the EDS source, again with Business Objects at the front end.

Then came the twist. Nine months into the project, Halifax and HBOS announced plans to merge. Within a few months of the deal being signed, the management of the new group decided that the Halifax Card Services data warehouse infrastructure would serve as the model for the new group. That meant it would be augmented with another 300 million records from Bank of Scotland.

“We always knew we were going to grow the business, but even in our wildest dreams we never imagined by how much the warehouse would grow,” says Tony Stewart, the data warehouse project manager at the time. “If you asked most people to practically double the size and functionality of their data warehouse in a couple of months, they’d tell you it would simply not be possible. In fact, we completed it in the four months [between February and June 2002],” says Stewart.

The warehouse, which is fed transaction data overnight from EDS and (again) First Data, now holds more than 800 million credit card records, each containing 2,000 data fields.

As a result, analysts can now drill right down to individual transaction level when producing market analyses and comparisons across customer groups. “Two years ago, we set up a data mart to provide some fairly simple management information. Now we have a data warehouse that drives a business-critical function and sits at the core of many of our decision-making processes,” says Stewart. One of the primary paybacks: in the 2002 financial year, the newly merged bank acquired 1.4 million new credit card customers, giving it a 21% share of the UK market.

Dell has similar goals, aiming to ensure that data in the warehouse is only at most an hour old (see box, In practice).

Both companies have built enterprise-wide data warehouses, essentially ‘one version of the truth’. But for most, analytical data is still distributed across different parts of the organisation – resulting in duplication and unreliability.

End of the mart

In the late 1990s and into this decade, providing staff with decision-making power often meant delivering a data engine that was fit for the task in hand but which ultimately stood apart from the other corporate data structures.

Such data marts – single-subject, decentralised databases – typically contain application-specific and aggregate or summary data, not detailed data. In many cases, different data marts also contain duplicate data – customer detail, name and address, income, credit history, and so on. But because they sit apart from each other and are often populated at separate times, they produce contradictory analyses for different groups and result in considerable duplication of effort and cost.

Moreover, data marts, though conceptually alluring, have become notoriously expensive to maintain. The cost of running each data mart is put at between $1 million and $2 million per year, say analysts at Meta Group. “It’s the long term support cost of marts that really eat you alive – DBAs, systems admins, network cost, moving all the data to marts, all the data preparation, the mainframe chargebacks, the maintenance pricing on the software and hardware,” says Stephen Brobst, chief technology officer at NCR’s Teradata unit.

Though many large companies have made the decision to move to an enterprise data warehouse, the deployment of data marts does not seem to have stopped. In 2002, 31% of respondents to the BuzzBack survey reported their organisations had anywhere between 11 and 100 data marts deployed; in 2003 that figure rose to 38%. One factor may simply be a greater awareness of their existence.

“Very often these marts are not something that IT has built, but something that is hidden under someone’s desk in marketing or risk management or something like that. But these things, because they are in the dark, grow like mushrooms,” says Brobst.

In any case, existing data marts are jealously guarded by departments. The reason for that is clear.

Data marts are cheaper and quicker to deploy and are often funded out of a departmental budget; and because the department feels it owns the data, any suggestion that they should be closed down and absorbed into an enterprise data warehouse is nothing short of a political act.

“It is only in recent years when the budget pressure has been more intense, that the need for thriftiness in managing the budget has been able to overwhelm the politics,” says Brobst.

“Data mart consolidation is one of the top three IT projects that can result in cost savings,” says Brobst. And there is plenty of evidence that consolidating marts into a warehouse pays off – despite the often-substantial effort involved. Studying the data mart consolidation at US mobile telecoms company GST, researchers at the Kellogg School of Management concluded that the three-year switch to an enterprise data warehouse, from scores of data marts, produced a return on investment of 65% and saved GST $27 million in just one year. It is a similar story at Bank of America, which claims to have saved tens of millions of dollars by consolidating its scores of data marts over an 18-month period.

Those kinds of moves signal that, at most large organisations, data marts are being rolled up into enterprise data warehouses – a single data repository containing consistent data from and about the whole company, or at least a major division. The aim is to provide multiple business functions and departments with different views of the same data. The underlying detailed data, however, is stored only once.

“The drive is towards the elimination of duplicate data,” says Randy Mott, CIO of Dell and previously CIO of Wal-Mart.

Six out of ten large companies, judging from the BuzzBack sample, are currently investing in enterprise data warehousing technology, and another fifth say they will do so within the next two years.

Of course, running an enterprise data warehouse is hardly cost free. For ongoing support of a large data warehouse, analysts talk of an average of $500,000 per year per ‘subject area’, and a typical data warehouse will have six subject areas. As high as that seems, it is still only equivalent to maintaining two to three data marts, says Teradata’s Brobst.
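Taking those analyst figures at face value, the arithmetic behind that comparison is simple enough to write down:

```python
# Back-of-envelope sums using the figures quoted above (analyst
# estimates, not audited costs).
warehouse_cost = 500_000 * 6          # $500k per subject area, six areas
mart_cost_low, mart_cost_high = 1_000_000, 2_000_000  # per mart, per year

# How many data marts the same $3m annual spend would cover:
equivalent_marts = (warehouse_cost / mart_cost_high,   # expensive marts
                    warehouse_cost / mart_cost_low)    # cheap marts
# i.e. between 1.5 and 3 marts, consistent with "two to three"
```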

At the same time, as organisations consolidate data marts they are also often building operational data stores (ODSs) – replications from the transaction processing systems that are used for tactical decision making and operational reporting. But the stress should be on tactical, say analysts.

“An ODS is like a data mart in sheep’s clothing,” suggests Brobst. “Three years from now, those organisations building ODSs will be consolidating these ODSs into the data warehouse.”

The industry’s aim is clear: to enhance the data warehouse capability with new service levels for data freshness and performance that can support both strategic and tactical decision making. At this stage, not all technologies are capable of supporting that active, enterprise data warehousing, but the direction being set by the trailblazers, and the business benefits they are seeing, will cause many to follow.


Ben Rossi

Ben was Vitesse Media's editorial director, leading content creation and editorial strategy across all Vitesse products, including its market-leading B2B and consumer magazines, websites, research and...
