Avoiding an outage disaster: continuous availability

The massive flight cancellations and delays at British Airways, and the painful hit they dealt to the stock price of its parent company, highlight today’s intolerance for digital disruption. The airline has cited a power surge following an outage as the culprit, saying the event took out a large data centre. The big question is why disruption of one data centre – even if it were a significant technology location – could so broadly impact operations for a global company like BA.

Few companies can withstand an outage of more than a few minutes without significant financial impact these days. For some organisations, even a few minutes of downtime would prove incredibly costly (think of the $66,000 Amazon is reported to lose per minute of being offline).

Given this significant financial pain, organisations have invested heavily in backup and disaster recovery (DR) systems. As a global airline, BA most certainly had such systems in place – indeed, the company has commented that its DR systems did not intervene as expected.

This comment does raise questions. BA should have had data centre systems in place that would enable it to withstand even a massive disruption in electrical service. Other locations should have kicked in to take over operations.

It’s fair to question, however, why a company like BA would still be relying on a DR plan. Such an architecture works like this: expect that you will have a “disaster” in one location, then have systems elsewhere ready to turn on so you can “recover” from the disaster.
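
To make the weakness of that model concrete, here is a minimal sketch of an active/passive pair. It is an illustration only, not BA’s actual setup: the class name `DisasterRecoveryPair`, the `startup_ticks` parameter, and the idea of counting recovery time in requests are all assumptions made for the example.

```python
class DisasterRecoveryPair:
    """Toy model of DR: one active site, one cold standby (hypothetical)."""

    def __init__(self, startup_ticks):
        self.primary_up = True
        # Assumption: the standby needs this many ticks to come online.
        self.startup_ticks = startup_ticks
        self.standby_ready_in = None  # None means the standby is still cold

    def fail_primary(self):
        """Simulate the disaster; only now does the standby start warming up."""
        self.primary_up = False
        self.standby_ready_in = self.startup_ticks

    def handle_request(self):
        """Return the site that served the request, or None if it was dropped."""
        if self.primary_up:
            return "primary"
        if self.standby_ready_in > 0:
            # Recovery window: nobody is serving traffic yet.
            self.standby_ready_in -= 1
            return None
        return "standby"


pair = DisasterRecoveryPair(startup_ticks=3)
before = pair.handle_request()
pair.fail_primary()
during = [pair.handle_request() for _ in range(3)]
after = pair.handle_request()
```

In this toy run every request that arrives during the recovery window is dropped, which is the gap a DR plan accepts by design.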

Simply put, DR isn’t good enough today. Customers, partners, and internal users require seamless access to your apps and data – without that, your business can’t run. So what’s better than DR? Continuous availability. With that architecture, no matter what failures happen at a device, data centre, or cloud region level, the same operations are already running in other locations and they simply take on more of the load.

How does IT deliver continuous availability? By running active/active architectures with applications, servers, databases, and network capacity operating in multiple locations at once. This approach, with traffic running across a broader geographical domain – and often spanning on-premise and cloud-based resources – provides far greater resiliency at the application layer. This philosophy says “failures will happen – you just shouldn’t have your whole system go down as a result.”
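
As a rough sketch of the active/active idea, the snippet below spreads requests round-robin across every healthy site and simply skips a site once it is marked unhealthy. The `Site` and `ActiveActiveRouter` names and the site labels are hypothetical, and real deployments would rely on health checks and global traffic management rather than a flag set by hand.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    healthy: bool = True
    served: int = 0


class ActiveActiveRouter:
    """Round-robin over all sites that are currently healthy (illustrative)."""

    def __init__(self, sites):
        self.sites = list(sites)
        self._next = 0  # index of the site to try first on the next request

    def route(self):
        """Send one request to the next healthy site and return its name."""
        count = len(self.sites)
        for offset in range(count):
            site = self.sites[(self._next + offset) % count]
            if site.healthy:
                self._next = (self._next + offset + 1) % count
                site.served += 1
                return site.name
        raise RuntimeError("no healthy site available")


sites = [Site("site-a"), Site("site-b"), Site("cloud-region")]
router = ActiveActiveRouter(sites)

for _ in range(6):
    router.route()          # all three sites share the load

sites[0].healthy = False    # simulate losing one data centre

for _ in range(6):
    router.route()          # the two survivors absorb the traffic
```

No request is dropped after the failure in this sketch; the remaining sites just serve a larger share, provided they have the spare capacity to do so.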

Organisations are relying on scaled-out server architectures, modern databases, database load balancing software, and cloud services to enable active/active architectures that ensure business continuity.

The toughest technology layer to support active/active operations is at the application tier, because apps are fed by data and are typically written to directly talk to the data tier. Leveraging database load balancing software provides the buffer between apps and the data tier to support active/active operations without changing the application code.
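
The buffering role described above can be sketched as a thin layer that exposes a single `execute` call to the application and decides behind it which database server handles each statement. This is a simplified illustration, not how any particular product works: the `Backend` and `DatabaseLoadBalancer` classes, the `accepts_writes` flag, and the rule that anything other than a `SELECT` counts as a write are all assumptions.

```python
class Backend:
    """Stand-in for one database server (hypothetical)."""

    def __init__(self, name, accepts_writes):
        self.name = name
        self.accepts_writes = accepts_writes
        self.up = True

    def execute(self, sql):
        if not self.up:
            raise ConnectionError(f"{self.name} is unreachable")
        return f"{self.name}: {sql}"


class DatabaseLoadBalancer:
    """Single endpoint for the app; routing and failover happen here."""

    def __init__(self, backends):
        self.backends = list(backends)
        self._turn = 0

    def execute(self, sql):
        # Assumption: any statement that is not a SELECT is treated as a write.
        is_read = sql.lstrip().upper().startswith("SELECT")
        count = len(self.backends)
        for offset in range(count):
            backend = self.backends[(self._turn + offset) % count]
            if not is_read and not backend.accepts_writes:
                continue  # read-only servers cannot take writes
            try:
                result = backend.execute(sql)
            except ConnectionError:
                continue  # failed server: try the next one
            self._turn = (self._turn + offset + 1) % count
            return result
        raise RuntimeError("no backend available for: " + sql)


db_a = Backend("db-site-a", accepts_writes=True)
db_b = Backend("db-site-b", accepts_writes=True)
db_c = Backend("db-cloud-replica", accepts_writes=False)
balancer = DatabaseLoadBalancer([db_a, db_b, db_c])

first_read = balancer.execute("SELECT 1")
db_a.up = False  # one site goes dark; the app keeps calling the same endpoint
write = balancer.execute("INSERT INTO bookings VALUES (1)")
second_read = balancer.execute("SELECT 2")
third_read = balancer.execute("SELECT 3")
```

The application code calling `balancer.execute` never changes when `db-site-a` disappears, which is the point of placing the buffer between the two tiers.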

The move from DR to continuous availability is paramount, and BA’s pain is simply an example of a company paying the price for not getting to that business model already.


Sourced by Michelle McLean, VP of marketing at ScaleArc


Nick Ismail

Nick Ismail was an editor for Information Age from 2018 to 2022 before moving on to become Global Head of Brand Journalism at HCLTech. He has a particular interest in smart technologies, AI and...