How Pfizer uses data virtualisation to accelerate R&D

Pfizer is the second-largest pharmaceutical company in the world. As with all drug-makers, research and development is the engine of its business. Pfizer devotes a sum equivalent to more than 15% of its turnover to R&D every year – $7.8 billion in 2009.

At the heart of Pfizer’s research arm sits Worldwide Pharmaceutical Sciences (PharmSci), a group of top scientists charged with designing and synthesising all the drugs involved in clinical trials. PharmSci runs a complex, constantly changing portfolio of projects that require vast quantities of data.

Pulling that data together used to put a brake on the process of analysing new drugs. “Fundamentally, our challenge was that we had a rather regimented approach to data integration using standard ETL tools,” recalls Dr Michael Linhares, a research fellow and leader of the Business Operations Group within PharmSci. “It always ended up taking longer and costing more than we wished.”

During the month-long periods that it would take Linhares’s department to compile the data that analysts needed, their requirements would often change. “We would sometimes even gather requirements from customers, put something together and, by the time we had finished, they no longer needed it,” he says.

Prompted by a budget cut that meant his department needed to become more efficient, Linhares weighed up his options. One approach was an enterprise information integration (EII) project that could unify access across the organisation’s standard Oracle databases.

“The other was just to streamline the way we do ETL with our Informatica server and minimise the amount of data moved,” he says.

However, when he examined data virtualisation technology, still emerging at the time, it seemed to be the perfect solution. “It addressed a lot of our issues, allowing us to come up with ideas, rapidly prototype solutions, and put them in front of the customer.”

Although the organisation used software from Informatica to manage its data warehousing infrastructure, at the time it did not have a data virtualisation offering.

Linhares came across industry pioneer Composite Software. “The main benefit of Composite was that its interface was very easy to use – very friendly for someone who knows SQL well and has an understanding of how to integrate data,” he explains. Connecting to data was straightforward, he adds, and the caching engine was strong.

“The ability to optimise queries within Composite and cache them was a huge benefit at the time.”

Linhares used Composite’s data virtualisation platform to build PharmSci Portfolio Database (PSPD), a federated data delivery system. This integrates all of PharmSci’s data sources into a single reporting schema of information that can be accessed by users.

A wide variety of disparate data sources are integrated under PSPD, including Enterprise Project Management, a SQL Server database of drug portfolio project plans; the Global Information Factory, an Oracle-based data warehouse of financial data; and OneSource, a database of drug portfolio information. PSPD serves data to applications including SAP Business Objects for ad hoc queries, standard reports and dashboards, and Tibco Spotfire for analytics.

Composite Software cannot rest on its laurels, though: Linhares is constantly evaluating the merits of rivals, given the huge boost that data virtualisation has provided his organisation.

“The Informatica data services offering is very structured around the Informatica development process,” he says. “If a developer is comfortable within the Informatica environment then it’s a very good solution for them, but if you’re not an expert there’s a huge learning curve.”

For now, though, Linhares is sticking with Composite as it remains a simple yet flexible tool, he says.
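
For readers unfamiliar with the approach, the general idea behind a federated reporting layer with caching, as described for PSPD above, can be sketched in a few lines of Python. This is only an illustration and not Pfizer's or Composite's implementation: it uses SQLite in-memory databases as stand-ins for the separate source systems, and the table names, columns, figures and the `portfolio_report` function are all hypothetical.

```python
import sqlite3

# Two in-memory databases stand in for separate source systems
# (hypothetical names; the real sources are SQL Server and Oracle).
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS project_plans")
conn.execute("ATTACH DATABASE ':memory:' AS finance")

conn.execute("CREATE TABLE project_plans.plans (project_id TEXT, milestone TEXT)")
conn.execute("CREATE TABLE finance.costs (project_id TEXT, spend_usd REAL)")
conn.executemany(
    "INSERT INTO project_plans.plans VALUES (?, ?)",
    [("P1", "Phase I"), ("P2", "Phase II")],
)
conn.executemany(
    "INSERT INTO finance.costs VALUES (?, ?)",
    [("P1", 1.5), ("P1", 2.0), ("P2", 4.0)],
)

# The "virtual" reporting view: data is joined at query time
# rather than copied into a warehouse by an ETL job.
REPORT_SQL = """
SELECT p.project_id, p.milestone, SUM(c.spend_usd) AS total_spend_usd
FROM project_plans.plans AS p
JOIN finance.costs AS c ON c.project_id = p.project_id
GROUP BY p.project_id, p.milestone
ORDER BY p.project_id
"""

_cache = {}  # naive query-result cache keyed on the SQL text


def portfolio_report(use_cache=True):
    """Return the federated report, serving cached rows when allowed."""
    if use_cache and REPORT_SQL in _cache:
        return _cache[REPORT_SQL]
    rows = conn.execute(REPORT_SQL).fetchall()
    _cache[REPORT_SQL] = rows
    return rows


if __name__ == "__main__":
    for row in portfolio_report():
        print(row)
```

A reporting tool would call something like `portfolio_report()` and see one combined schema, while the underlying data stays in its original systems. The cache trades freshness for speed, so a real deployment would need some rule for when cached results expire.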
