Once or twice every decade, a catastrophic event strikes a business hub with tragic loss of life and massive business disruption. The Loma Prieta earthquake south of San Francisco in 1989, the IRA bombing of the City of London in 1992, the Kobe earthquake in Japan in 1995, and, of course, the devastating terrorist attacks on New York and Washington in September 2001 all underscore the vital – and evolving – role of business continuity and disaster recovery planning.
The frequency of these events might seem all too consistent, but the sheer scale of the disruption to IT-based business functions is growing exponentially. The role of computing, especially in sectors such as financial services, meant that in New York, in the absence of an IT infrastructure, whole businesses ceased to exist – at least until their disaster recovery plans swung into action.
Additionally, the attacks on the Twin Towers sent a resounding message to business leaders responsible for ensuring their organisation can recover from a major disaster. While almost all previous situations had been defined by their local impact, the global terrorist threat means that almost any building around the world becomes a potential target – a European headquarters in London, a data centre in the US, a manufacturing plant in Asia.
That has prompted organisations to take a long, hard look at their business continuity strategies – the location of ‘hot standby’ facilities, the viability of their recovery plan, the level of automation of data back-up, and so on – and to examine how a serious disruption might severely impact their operations.
"The World Trade Center (WTC) was one of the first incidents that really tested the advanced recovery and high availability services that are now available within the business community," says Keith Tilley, managing director for Europe at services provider Sungard Availability Services.
For example, recovery technologies that rely on frequent data back-up and data mirroring helped avert massive data loss for many organisations in lower Manhattan, while arrangements with recovery service providers enabled many to move and restart their operations relatively rapidly at alternative IT sites.
Organisations that had well-planned business continuity strategies – even some located within a few hundred yards of the Twin Towers – were able to recover relatively well (see Restart from ‘Ground Zero’). But many others did not.
While most organisations had taken sufficient measures to protect their core centralised computing infrastructure, the end-user environment of businesses did not have a similar level of contingency planning, says Todd Gordon, vice president and general manager of business continuity and recovery services at IBM Global Services. Companies failed to protect end points on systems such as workstations and router configurations, and they failed to make basic provisions for items such as desks, printers, voice mail and a customer service dial tone, he adds.
Early business recovery strategies tended to focus primarily on restoring IT equipment and connectivity after a major outage. But that is not enough. "Business continuity is about preventing problems and maintaining the business. It is about covering the business processes, the command, control and communications and people aspects," says Robin Gaddum, managing consultant at UK-based services supplier Guardian iT.
Above all else, what has become evident is the vital importance of IT people following a major disaster. "Prior to 11 September, every business continuity plan assumed that organisations would have access to the majority of their staff," says Gaddum. That assumption has changed.
In tragic circumstances, for example, one of Sungard's customers lost every member of its business continuity planning department in the Twin Towers attack, says Tilley. "Clearly, putting all of your staff under one roof in the building next to yours has inherent security vulnerabilities. [Now] organisations are looking at distributing their [recovery] staff more widely," adds Gaddum.
For the six months since the attacks on the US, however, the emphasis has been more on re-examination than on action. Research points to little having changed in terms of investment in new services and technologies. "There is a lot of inquiring and analysis going on, but so far not the increased commitment in dollars to implement end-to-end continuity plans," says Gordon.
In one survey, IT services vendor CMG found that only 9% of UK organisations with more than 500 employees had reviewed their disaster recovery plans since the rise of the new terrorist threats. "Spending is not occurring at the rate you would think it should relative to the deficiencies [in many organisations' strategies]," says Gordon.
But a business continuity strategy is not something that can be put in place in a matter of weeks. "I don't think we will be able to measure for about another two financial quarters where [new business continuity] funds are being directed," adds Gordon. New contracts at service providers, he notes, take between six and nine months to agree and implement.
One common reason cited for a lack of new activity is that many organisations already feel they have adequate business continuity provision in place. For example, Ian Campbell, CIO at UK telecoms and service provider Energis, says, "There has been no change to strategy at the company from before the events of 11 September, other than a review and assessment of any possible lessons that could be learned."
Similarly, Peter Cox, systems director at UK food retailer Waitrose, says there has been no change in its strategy because, "We had [business continuity provision] firmly in our sights already."
But such companies are probably not the internationally recognised symbols of western business that are most at threat. Companies with global brands have taken a much deeper look at their exposure – but most, not surprisingly, are unwilling to comment on their continuity plans.
The difficulty is that much of disaster recovery planning is subjective. Companies have to assess what they are trying to protect themselves against, the likelihood of it occurring, and the cost of avoidance or recovery, says Cox.
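The three factors Cox lists can be weighed against each other in a rough calculation. The sketch below is purely illustrative: the `Threat` record, its field names and every figure fed into it are assumptions made for the example, not numbers supplied by Waitrose or any other company quoted here. It compares the loss a continuity measure would be expected to avert each year with what that measure costs to maintain.

```python
from dataclasses import dataclass


@dataclass
class Threat:
    """One scenario an organisation might plan for (all values hypothetical)."""
    name: str
    annual_probability: float      # assumed chance of the event in any one year
    loss_if_unprotected: float     # assumed cost of the outage with no provision
    loss_if_protected: float       # assumed cost with the provision in place
    annual_protection_cost: float  # assumed yearly cost of the provision


def expected_annual_saving(threat: Threat) -> float:
    """Expected loss averted per year, minus the yearly cost of protection."""
    averted = threat.annual_probability * (
        threat.loss_if_unprotected - threat.loss_if_protected
    )
    return averted - threat.annual_protection_cost


def worth_funding(threat: Threat) -> bool:
    """A measure pays for itself only if the expected saving is positive."""
    return expected_annual_saving(threat) > 0
```

The arithmetic is the easy part. The subjectivity Cox describes lies in the inputs: two planners who disagree about the probability of an event will reach opposite conclusions from the same formula.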
Levels of business continuity requirement also vary enormously. For many financial institutions and those with high levels of online transactions, even a few minutes of downtime is too much. Others can get by for days.
Steven Hunter, operations manager at CMP Information, the professional media publishing division of United Business Media, admits his company did not have a business continuity strategy in place before September 2001. That "kick-started" the company's plans, he says.
Like many, the company has decided that instantaneous recovery is not necessary. "The business has told [IT operations] that they can afford to be down for about two days before it starts to have any sort of major impact on them," says Hunter. "The way the business viewed the risk was that if a disaster was to strike on the day it was going to press, it would be a major problem." But as CMP spreads its publication of various journals and magazines across the calendar, managers think the organisation could survive for a couple of days, he adds.
"Organisations would love instant recovery, but what they are demanding is typically between two and four hours for major applications," says Phil Jones, director of business development for Europe, Middle East and Africa, at storage specialist Hitachi Data Systems.
Providing fast recovery may not be easy but the technology is there to help organisations ensure downtime is minimised. One crucial technology, for some organisations, is real-time data mirroring. Investment bank Schroders, for example, synchronises the transfer of data between its two London data centres located on either side of the River Thames using data mirroring software from storage vendor EMC.
"The beauty of having that kind of failover is that an organisation's return on investment is being maximised, as there is very little redundancy. The only issue is whether the organisation has enough capacity at one of the data centres to run all of its key systems in the event of a disaster," says David Jennings, technical business consultant at EMC. Schroders tests the capability of one data centre to support the operations of both every three months.
However, replicating or mirroring data in near real-time, particularly over long distances, is not cheap. "If you ask a hundred CIOs, all hundred would like to [use geographically dispersed data centres]. But can they afford to do it? The answer, for most, is no," says Charles Rutstein of Forrester Research.
Data replication costs are largely determined by distance. "If you are within fibre line distance (within 50 kilometres), then it is not too expensive. But if you have T1 or T2 lines over a long distance requiring lots of telecommunications work, then it becomes very expensive," says Donna Scott, research director at IT industry research group Gartner.
Clearly, traditional methods, such as backing up to tape once or twice a day and then sending tapes to a secure off-site location, are much cheaper, despite being more labour intensive. Now companies are looking to back up far more frequently. "We back up the data on our mainframe continuously in real-time," says Cox at Waitrose.
Without regular rehearsals of disaster scenarios, though, weaknesses can lie undiscovered. Ian Bowdidge, business continuity manager for Europe at Hewlett-Packard, warns: "We are still amazed at the number of customers who come to us and either can't read their back-up tapes, or their back-up tapes do not contain what they think they contain."
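The failure Bowdidge describes, back-ups that cannot be read or do not hold what was expected, is the kind of problem a routine test restore can expose. As a minimal sketch, assuming the back-up can be restored to an ordinary directory, the Python below records a SHA-256 digest for each file when the back-up is taken and later checks a test restore against that record. The function names are hypothetical and do not describe any vendor's product or any process used by the companies quoted here.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(source_dir: Path) -> Dict[str, str]:
    """Record a digest for every file under source_dir at back-up time."""
    return {
        p.relative_to(source_dir).as_posix(): sha256_of(p)
        for p in sorted(source_dir.rglob("*"))
        if p.is_file()
    }


def verify_restore(manifest: Dict[str, str], restore_dir: Path) -> List[str]:
    """List every file that is missing from, or altered in, a test restore."""
    problems = []
    for relative_name, expected in sorted(manifest.items()):
        restored = restore_dir / relative_name
        if not restored.is_file():
            problems.append(f"missing: {relative_name}")
        elif sha256_of(restored) != expected:
            problems.append(f"altered: {relative_name}")
    return problems
```

An empty list from `verify_restore` means every file recorded at back-up time came back intact. The manifest itself would need to be stored away from the media it describes, otherwise the check is lost along with the back-up.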
Other lessons have emerged from the disaster recovery efforts in New York. Protection against data loss needs to be much more extensive, says IBM's Gordon. On 11 September, most of the data lost was on user devices, he says. "There were over 100,000 workstations lost or destroyed. And backing up the information on those workstations was left up to individual users. That contrasts sharply with the solid data back-up policy most companies had for their core server systems where data loss was minimal."
Over the coming months, those kinds of lessons will be applied at major corporations around the globe as non-IT executives come to appreciate the impact a loss of computing function can have on their organisations. "Business continuity has to be led from the top. If you haven't got board-level commitment, then you don't know if what you are doing is what the business needs," says Bowdidge.