Intelligence builder

In order to get the best deal from its suppliers, it stands to reason that an organisation needs to know how many suppliers it uses, and on what terms. But when Brunswick, a $3.8 billion US-based manufacturer of recreational equipment from boats to billiards tables, attempted to calculate the size of its supplier base, it found that its initial estimate of between 10,000 and 12,000 suppliers was wildly off the mark. In fact, the true figure was closer to 30,000.

Hands on: Dennis Publishing

Dennis Publishing produces a range of consumer magazines covering a broad section of interests. Several years ago, the company outsourced its subscription management and web site hosting functions to third-party specialist companies. However, it soon became clear that Dennis employees needed direct access to the information those third parties were capturing on its behalf.

Moreover, Julian Thorne, Dennis' circulation director, believed that the company needed to know how this data overlapped. For example, he saw great value in being able to determine which website visitors showed a clear interest in a particular subject matter covered by Dennis publications but who did not already subscribe to print editions of those magazines.

In order to provide these capabilities, systems integration company Lateral created a data warehouse, which combined a subset of highly granular data gathered from both external sources. "From the outset, Lateral made sure our expectations were deliberately set low," recalls Thorne. "It was made clear that, at worst, all that we were going to find out from the project was that our operational data was in a bad condition, and that the project could simply lead to us just sorting out our operational systems."

This ‘worst case scenario’ did not materialise – ‘bad’ data was corrected relatively easily. But extracting, transforming and loading the data was a massive task in itself. The ETL system delivered by Lateral comprised 108 separate bespoke jobs for extracting data from some 356 different source files. This data subsequently populated 200 tables with 2,512 attributes. Most of the data was taken from either AS/400 or Unix platforms, and the target database was a DB2 database running on an RS/6000 server.

Already, Dennis is starting to see tangible benefits. "The data warehouse has been running for six months now, and most of our analysis so far has been quite simplistic. But simple things like making sure we are not offering reduced-rate introductory subscription offers to existing customers have already saved us a lot of money," says Thorne.

The reason for the discrepancy was historical: Brunswick built its business through acquiring smaller companies, which it incorporated into its overall structure as new divisions. Each division had its own systems, its own suppliers – and its own supplier data. Only by consolidating this data using extraction, transformation and loading (ETL) software from data analytics specialist Informatica, and then analysing it using Informatica’s analytic applications, was Brunswick able to get a clearer picture of its procurement activities.

Like Brunswick, most organisations find that achieving a ‘single version of the truth’ relies heavily on successful data migration. In fact, along with data cleansing and analysis, ETL is key to a successful business intelligence project. In the ETL process, data is first acquired from its input source, validated and transformed according to particular job specifications, and then loaded in a standard format to a new repository.
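The acquire, validate, transform and load sequence described above can be sketched in a few lines of Python. This is a minimal illustration only, not any vendor's implementation: the sample supplier rows, the validation rule, the `supplier` table and every function name are assumptions made up for the example, and an in-memory SQLite database stands in for the target repository.

```python
import csv
import io
import sqlite3

# Hypothetical extract from one division's system (invented sample data).
SOURCE_CSV = """supplier_id,name,country
101, Acme Marine ,us
102,,US
103,Baltic Felt,ee
"""


def extract(text):
    """Acquire raw rows from the input source (here, CSV text)."""
    return list(csv.DictReader(io.StringIO(text)))


def validate(rows):
    """Split rows into those meeting the job specification and rejects."""
    valid, rejected = [], []
    for row in rows:
        # Assumed rule: a numeric supplier id and a non-empty name are required.
        if row["supplier_id"].strip().isdigit() and row["name"].strip():
            valid.append(row)
        else:
            rejected.append(row)
    return valid, rejected


def transform(rows):
    """Convert valid rows to the standard format of the target table."""
    return [
        (int(r["supplier_id"]), r["name"].strip(), r["country"].strip().upper())
        for r in rows
    ]


def load(conn, records):
    """Write the standardised records to the target repository."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS supplier "
        "(id INTEGER PRIMARY KEY, name TEXT, country TEXT)"
    )
    conn.executemany("INSERT INTO supplier VALUES (?, ?, ?)", records)
    conn.commit()


def run_job(text, conn):
    """Run one ETL job and report how many rows were loaded and rejected."""
    valid, rejected = validate(extract(text))
    load(conn, transform(valid))
    return len(valid), len(rejected)
```

Calling `run_job(SOURCE_CSV, sqlite3.connect(":memory:"))` loads the two well-formed rows and sets aside the one with a missing name. Keeping rejects separate rather than silently dropping them reflects the point made later in the article: combining inaccurate data does not make it accurate.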

However, while the basic objective of ETL – consolidating data – is straightforward enough, the process is complex and time-consuming, even with the help of tools that to some extent automate the process. Before a byte can be shifted, technology decision-makers need to assess the attributes of data residing on disparate legacy systems, and decide which tools are best suited to the task. And even at that early stage, the real problems quickly become apparent. “It’s no good carrying out an ETL project if you are taking inaccurate data and just combining it in different ways. When you try to get a single view of anything, it means combining data from multiple source systems and that is only easy if the source data has standardised values,” points out Jay Huff, director of business development and marketing at Ascential. Since between 40% and 70% of business intelligence project budgets are typically set aside for data migration, mistakes will likely prove costly.

Home-grown migration
Many IT departments still handle data migration manually by coding their own migration routines. However, success on one project or type of data can lead organisations to assume they can apply the same routines and procedures to other corporate data resources. “[These companies] do a small data mart, create programs that seem to work quite well, so they then try to scale that up,” says Huff. “We have been into [clients] that have literally 300 or 400 extract/transform teams, and have data staging areas sitting on lots of different databases, and they end up with a little cottage industry within the business.”

“The ETL process can become a beguiling end in itself,” agrees Huw Ringer, business director at systems integration company Lateral. “One client of ours spent two years and millions of dollars purely figuring out how to get ‘end to end’ metadata into and out of their chosen ETL tool without ever actually transferring a single byte of real live data onto the target platform that the business could use.”

The ETL landscape

‘Pure play’ vendors
ETL represents a core competency and accounts for most (in some cases, all) of these vendors’ license revenue. Vendors in this category include Ab Initio Software, Acta Technology, Ascential Software, Data Junction, DataMirror, Evolutionary Technologies International (ETI) and Informatica. This class of vendors is driving the bulk of innovation and ‘mind share’ in the ETL market.

Business intelligence vendors
Business intelligence tools and platforms are their core competency. For most of these vendors, ETL technology plays a supporting role to their flagship business intelligence offerings, or is one component of a broad offering including business intelligence and ETL. Vendors in this category include Cognos, Hummingbird, iWay Software (a division of Information Builders), Sagent and SAS Institute.

Database vendors
The ‘Big Three’ database vendors – IBM, Microsoft and Oracle – have an increasing impact on this market as they continue to bundle ETL functionality closer to the relational DBMS.

Other infrastructure providers
They provide various types of technical infrastructure components beyond the database environment. For these vendors, ETL is typically positioned as yet another technical toolset in their portfolios. Vendors in this category include Computer Associates and Embarcadero Technologies.

Source: Gartner Research

Five years ago, if an organisation could support flat files, then it could – with the right in-house skills – probably handle ETL on its own, says Ringer. But as organisations increasingly integrate external data from customers, suppliers, and distributors with their own, the situation has become exponentially more complex. “Today a lot of business-to-business data is XML-based, and that is not standardised, so you need the ability to take it in whatever form it comes and translate it into something more standard.”
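The translation problem Ringer describes can be illustrated with a small Python sketch that accepts the same kind of business document in two different XML shapes and maps both to one standard record. The two partner formats, the field names and the mapper functions are all hypothetical, invented for this example; only the standard library's `xml.etree.ElementTree` is used.

```python
import xml.etree.ElementTree as ET

# Two invented partner feeds describing an order in incompatible ways.
PARTNER_A = (
    "<order><id>A-1</id><customer>Example Ltd</customer>"
    "<total currency='GBP'>120.50</total></order>"
)
PARTNER_B = (
    "<PurchaseOrder number='B-7' buyer='Sample Inc'>"
    "<Amount>300</Amount></PurchaseOrder>"
)


def from_partner_a(root):
    """Partner A keeps every value in a child element."""
    return {
        "order_id": root.findtext("id"),
        "customer": root.findtext("customer"),
        "amount": float(root.findtext("total")),
    }


def from_partner_b(root):
    """Partner B keeps identifiers in attributes of the root element."""
    return {
        "order_id": root.get("number"),
        "customer": root.get("buyer"),
        "amount": float(root.findtext("Amount")),
    }


# One mapping function per known root element name.
MAPPERS = {"order": from_partner_a, "PurchaseOrder": from_partner_b}


def to_standard(xml_text):
    """Parse an incoming document and translate it to the standard record."""
    root = ET.fromstring(xml_text)
    mapper = MAPPERS.get(root.tag)
    if mapper is None:
        raise ValueError(f"no mapping defined for <{root.tag}>")
    return mapper(root)
```

Both `to_standard(PARTNER_A)` and `to_standard(PARTNER_B)` return dictionaries with the same three keys, so downstream load steps see one format. Each new partner format in this sketch costs one more mapper function, which hints at why the work grows as more external sources are added.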

This is an issue that will take organisations some time to tackle. According to a recent report from market research company Giga Group, “XML adapters that enable the major ETL vendors to consume and produce XML out of the ETL engine have been shipping for more than a year, although the adoption is still sparse. This is because XML is such a discontinuous technology that end-user installations are still learning (and defining) the rules of the game”.

Product proliferation
The ETL market has grown considerably over the last few years – from a core group of ETL specialists to a market that comprises vendors with a range of approaches and perspectives (see box, The ETL landscape). “If you look at all of the companies in the marketplace, you’ll see they have different heritages,” says Goldsbrough. As a result, customers find that a single ETL solution only goes part of the way to addressing their needs. “Every ETL solution out there is pretty good at extraction. The question you have to ask is how many sources can it extract from?” says Huff.

Kevin Magee, sales director at Information Builders, says that these concerns have prompted the company to take a different approach to ETL with its iWay software. In particular, it is challenging the established concept of physically moving disparate data into a single repository. “Many times this is the best solution. But a single view can also be achieved through combining middleware with the metadata layer,” he says. “When integrated with middleware, the metadata layer provides a single view of the data, wherever the data is actually stored.” Using this approach, Magee claims, iWay can access over 85 different database types on many different platforms.
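The general idea of a metadata layer that presents a single view without moving the data can be sketched as follows. This is an illustration of the concept only and says nothing about how iWay itself works: the two source dictionaries, the `CATALOG` mapping and the `single_view` function are assumptions invented for the example, with plain Python dictionaries standing in for remote systems.

```python
# Invented stand-ins for two systems whose data stays where it is.
SUBSCRIPTIONS = {"c1": {"mag": "Boating Monthly"}}
WEB_VISITS = {"c1": {"ts": "2002-11-05"}, "c2": {"ts": "2002-11-07"}}

# Metadata layer: each logical field name maps to the source that holds it
# and to the name that source uses for the field.
CATALOG = {
    "magazine": (SUBSCRIPTIONS, "mag"),
    "last_visit": (WEB_VISITS, "ts"),
}


def single_view(customer_id, fields):
    """Assemble one record at query time by reading each field in place."""
    view = {"customer_id": customer_id}
    for field in fields:
        source, source_field = CATALOG[field]
        record = source.get(customer_id, {})
        # A field the source does not hold for this customer is left as None.
        view[field] = record.get(source_field)
    return view
```

Here `single_view("c2", ["magazine", "last_visit"])` returns a record with a visit date but no magazine, without either source having been copied into a central repository. The trade-off is that every query depends on all the underlying sources being reachable at read time.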

Ascential is also focussing on offering a so-called ‘end-to-end’ product that addresses a wider range of ETL needs. In the past year, it has made a number of key company acquisitions to that end, including its recent acquisition of data quality and cleansing specialist Vality Technology. “We think we have all the pieces now,” says Jay Huff. “When I joined Ascential about eighteen months ago, I was surprised to see a data integration market as fragmented as it was. You have ETL vendors, profiling vendors, and [companies specialising in] data quality. It makes no sense to me, as the process demands all of those things.” And as the process becomes exponentially more complex, the bewildering choice faced by IT decision-makers is unlikely to become clearer.

Ben Rossi
