It has always been part of the IT department’s remit to maintain the quality, availability and utility of data. But as the responsibility for procuring applications is devolved to business units and users achieve greater autonomy over their devices, managing data is arguably becoming IT’s primary role.
This was reflected in a recent report by research firm the Corporate Executive Board. The research found that in 2012 CIOs intend to spend 39% of their IT budget on information management (up from 36% last year) compared with 32% on business process automation. Until 2011, automation had always trumped information in budgetary allocation, the firm reported.
Similarly, analyst company Ovum predicted that information management (including business intelligence and analytics) will be the fastest-growing segment of the global software industry over the next four years, with a compound annual growth rate of 10%.
“The volume of information within enterprises continues to grow at an astonishing rate, and investment is needed to both manage this information and turn it into actionable intelligence,” said Ovum analyst Tim Jennings.
Another clue came in the form of Hewlett-Packard’s surprise $10.3 billion acquisition of UK information management provider Autonomy. “Together with Autonomy, we plan to reinvent how both unstructured and structured data is processed, analysed, optimised, automated and protected,” said then-CEO Léo Apotheker when he announced the deal.
Soon after the announcement, a report by the Bloomberg news agency alleged that HP had approached integration vendor Tibco and data warehousing specialist Teradata before making an offer for Autonomy. If true, this suggests that HP's primary objective was to acquire a profitable, second-tier enterprise IT company; it also indicates that the data management space, rather than business applications, was the most likely to offer such a target.
The benefits of improving data management are far-reaching and undeniable. For the London Borough of Brent, for example, applying master data management technology from IBM Initiate to build a citizen data index has led to a number of positive outcomes, as the borough’s information governance manager, Raj Seedher, told Information Age in October.
It has promoted information sharing by increasing confidence in data quality, and it has reduced administrative error, Seedher explained. Noise control officers now know when they have been called to a dangerous neighbourhood, and missing persons can be tracked more effectively now that family names have been formalised. The index has also helped the council save £1.2 million by thwarting tax fraud.
In the business arena, Greece’s Alpha Bank used integration technology from Information Builders to build a detailed view of operational and customer data, allowing it to assess its true risk exposure. This was not enough to protect it entirely from the country’s financial crisis, but may well have prevented even greater disruption.
By way of a negative example, in March the National Audit Office discovered that the Ministry of Defence’s logistics operations have been severely hampered by poor-quality data. It found that logistics systems, some of which were 30 years old, were poorly integrated, meaning that it was impossible to see a complete view of the MoD’s inventory. The NAO found that nearly 50% of equipment deliveries failed to arrive on time, often because an item was not in stock.
Investing in data
Despite the evident benefits of good data governance, it remains difficult to secure investment for large-scale, multi-year data management projects. These projects may serve the long-term, strategic interests of the organisation, but their financial benefits can be difficult to calculate and they may not deliver any benefit for a number of years. “These projects are like building a new motorway,” explained Edwin van der Ouderaa, global head of Accenture’s financial services analytics practice, in July. “Everybody will benefit, but nobody wants to pay for it.”
Van der Ouderaa advised, therefore, that organisations split large data management projects into smaller projects with short-term benefits and lower costs. “It’s much easier for the CIO to do that than to say, ‘I need £100 million and five years to do this.’”
The risk involved in high-cost, multi-year projects means that large organisations are no longer inclined to attempt to construct a single repository for data that serves the whole organisation’s needs. The length of time required for this kind of project is so long that strategic objectives are likely to have changed by the time the project is complete, and the amount of money required is too much to gamble.
Fortunately, technologies are emerging that help companies to build trust in data without constructing a megalithic central repository. One example is data virtualisation. Forrester Research describes data virtualisation as “a technology that abstracts, transforms, federates, and delivers data taken from a variety of heterogeneous information sources. It allows consuming applications or users to access data from these various sources via a request to a single access point.”
Done correctly, data virtualisation can improve data quality by removing the need for duplicate data stores – new 'virtual stores' can be created that present the original records directly, rather than replicating them.
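To illustrate the idea in miniature (this is a generic sketch, not any vendor's product – the source names and sample data below are invented), a 'virtual view' answers a request by federating the live sources at query time, so no duplicate store is ever created:

```python
# Minimal sketch of data virtualisation: a single access point that
# federates two heterogeneous sources at request time, without copying
# the underlying records into a new store.

# Source 1: a CRM-style list of records (invented sample data)
crm = [
    {"customer_id": 1, "name": "Acme Ltd"},
    {"customer_id": 2, "name": "Globex"},
]

# Source 2: a billing system keyed differently (invented sample data)
billing = {1: {"balance": 250.0}, 2: {"balance": 0.0}}

def virtual_customer_view(customer_id):
    """Resolve a request against the live sources; no duplicate store exists."""
    record = next(c for c in crm if c["customer_id"] == customer_id)
    # Merge the two sources on the fly into one consistent answer
    return {**record, **billing[customer_id]}

print(virtual_customer_view(1))
# {'customer_id': 1, 'name': 'Acme Ltd', 'balance': 250.0}
```

Because the view reads the sources directly, a correction made in either system is immediately reflected in every answer – which is precisely how virtualisation avoids the quality drift that duplicate stores introduce.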
One organisation using data virtualisation is drug giant Pfizer, which used the technology to integrate the data required by its drug discovery analysts. That data was typically contained either in operational data marts or in spreadsheets, and bringing it all together was a drain on the analysts’ time.
Using a data virtualisation product from Composite Software, Pfizer was able to reduce the time required to produce a new report by 90%, which led to a 100% improvement in analyst productivity.
Forrester Research predicts that adoption of data virtualisation will increase in the coming year as the technology matures and awareness of the benefits increases. It is not the only emerging data management technology, however. When the US Department of Defense (DoD) wanted to find a way to integrate its multiple human resources databases without creating a single, large repository (a previous attempt to do just that had failed miserably), it evaluated data virtualisation.
But as Dennis Wisnosky, chief technology officer for the DoD's Business Mission Area, told Information Age in November, data virtualisation requires data to be "translated" from one definition to another, which consumes processor cycles. Given the scale of the queries the DoD wanted to run on its HR database (it is the largest employer in the world), this "translation" burden meant that queries took too long.
Instead, the DoD has used semantic technology to define common definitions of data. Using the Resource Description Framework (RDF), it built an ‘ontology’ of concepts with which to categorise data. For example, the ontology allows systems to understand that an Army serviceman, an Air Force pilot and a Navy officer are examples of the concept ‘service member’.
This has allowed the DoD to answer questions such as 'How many Arabic-speaking service members do we have in Afghanistan?' much faster than ever before. It is currently rolling out the technology for use by some of the Pentagon's most senior officials.
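The mechanics can be sketched without any RDF library: facts and the ontology are both stored as subject–predicate–object triples, and a query walks the ontology's subclass links so that a question about 'service members' matches Army, Air Force and Navy records alike. All names and data below are invented for illustration and bear no relation to the DoD's actual ontology:

```python
# Toy semantic store in the spirit of RDF: every fact is a
# (subject, predicate, object) triple. All identifiers are invented.
triples = {
    # Ontology: subclass links, as the article describes
    ("ArmyServiceman", "subClassOf", "ServiceMember"),
    ("AirForcePilot", "subClassOf", "ServiceMember"),
    ("NavyOfficer", "subClassOf", "ServiceMember"),
    # Instance data (fictional people)
    ("p1", "type", "ArmyServiceman"),
    ("p1", "speaks", "Arabic"),
    ("p1", "locatedIn", "Afghanistan"),
    ("p2", "type", "NavyOfficer"),
    ("p2", "speaks", "Arabic"),
    ("p2", "locatedIn", "Germany"),
    ("p3", "type", "AirForcePilot"),
    ("p3", "speaks", "Arabic"),
    ("p3", "locatedIn", "Afghanistan"),
}

def is_a(entity, concept):
    """True if the entity's type is the concept or a transitive subclass of it."""
    types = {o for s, p, o in triples if s == entity and p == "type"}
    while types:
        if concept in types:
            return True
        # Climb one level up the subclass hierarchy
        types = {o for s, p, o in triples
                 if s in types and p == "subClassOf"}
    return False

def query(concept, **facts):
    """Entities of the given concept matching every predicate=object fact."""
    entities = {s for s, p, o in triples if p == "type"}
    return sorted(e for e in entities
                  if is_a(e, concept)
                  and all((e, p, o) in triples for p, o in facts.items()))

# 'How many Arabic-speaking service members do we have in Afghanistan?'
print(query("ServiceMember", speaks="Arabic", locatedIn="Afghanistan"))
# ['p1', 'p3']
```

The key point the sketch illustrates is that no record is ever "translated" into another schema: the subclass links are followed at query time, so the Army, Air Force and Navy rows are matched in place – the trade-off the DoD chose over virtualisation's per-query conversion cost.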
Data virtualisation and semantic technology both demonstrate that, although maximising the value of data assets has been one of the IT department’s key responsibilities for decades, innovative techniques that allow it to do so more successfully and economically are still emerging.