What can high-profile outages really teach businesses about IT resilience?

In the face of high-flying outages, there has never been a better time to question the realities of business continuity and disaster recovery.

It seems like only yesterday we were talking about the global IT outages experienced by United and Delta, and then the British Airways outage brought nearly 500 flights to a standstill. In the face of these continued, high-flying outages, there has never been a better time to question the realities of business continuity and disaster recovery (BC/DR) as it evolves.

The question extends beyond the aviation industry into the broader business landscape, as we ask ourselves time and time again: how can things go so drastically wrong in this age of exceptional technological advances?

Business success starts with preparedness

This most recent, well-publicised example is one of the many repercussions of outdated IT infrastructure and strategies that are incapable of achieving not only BC/DR but also true IT resilience.

The evolution of BC/DR, from simply having a second copy of data to rely on in the event of downtime to business agility and data mobility – specifically utilising the cloud – speaks to a world where uninterrupted business is expected.

Unfortunately, though not surprisingly, many organisations fall into the ill-prepared group. A power surge was the specific culprit in BA’s case, but an effective IT strategy accounts for all contingencies and builds in the proper redundancies to greatly minimise the impact. For an airline, the very real consequence of IT downtime is the grounding of hundreds of flights, and the effects are highly palpable:

• The negative impact on thousands, if not tens of thousands, of travellers, resulting in significant brand damage incurred globally across a wide range of media types.
• The staggering financial losses, which in BA’s recent case are estimated to top €100 million.
• The concern among the C-suite and the shareholders who invest in such public companies.

Uninterrupted IT business implications

As BA’s incident demonstrates, enterprise-class organisations are realising – albeit through difficult experiences – how inflexible and outdated their IT infrastructures are, leaving them unable to thrive in the face of challenges.

The causes of these IT outages are broad, covering everything from hackers and catastrophic natural events to more mundane but highly common factors such as power failures and human error. Because of this, global organisations do not have the luxury of focusing only on prevention. A proactive approach, rather than a reactive one, helps ensure access to mission-critical data and applications in the event of a major IT incident.

IT resilience means being able to dynamically respond to disruptions that impact business operations by ensuring critical data and applications are always available, creating seamless business continuity.

Recent advancements in enterprise-class IT resilience cloud solutions have made this task much simpler, faster and more cost-effective as companies shift from CAPEX-based infrastructure to OPEX-based public cloud platforms such as AWS, IBM Cloud, and Microsoft Azure.

Equally important is enabling enterprises to regularly and successfully test disaster recovery infrastructure with little to no business disruption, which is imperative for highly regulated industries to maintain compliance.

Far too often, DR tests are either not fully completed or are unsuccessful, owing to incompatible technologies or cumbersome manual processes that carry a high resource and cost burden.

Suffice it to say, when an outage occurs because of holes in the IT infrastructure fabric that were not addressed during the testing process, it forces hard questions: Does the answer lie in hiring more IT resources, spending more on technology, or rather in working around the current constraints in innovative ways?

Cloud is the key to IT resilience

Industry leaders have found, and continue to tirelessly seek out, innovative solutions to the big IT resilience questions. Incorporating a hybrid cloud approach into an IT resilience strategy can be a simple and affordable tactic for safeguarding data and applications against significant downtime.

IT teams working with a hybrid cloud infrastructure can anticipate issues and move data and applications before the damage hits. Proactive movement of data is impossible in a traditional data centre, but for those organisations embracing a virtual, cloud-ready IT environment, it is a reality.

Furthermore, in the case of an outage or breach, organisations can react quickly to ensure mission-critical systems and data are back up and operational within minutes, with virtually no data loss, by being able to “rewind” to a recovery point within seconds of a disruption.
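
To make the “rewind” idea concrete, here is a minimal sketch in Python of how a recovery point might be chosen. It is an illustration only, not any vendor’s implementation: the function names, the five-second checkpoint interval and the timestamps are all hypothetical. It picks the latest checkpoint taken before the disruption and reports how many seconds of data would be lost by rewinding to it.

```python
from bisect import bisect_left
from datetime import datetime, timedelta


def pick_recovery_point(checkpoints, disruption_time):
    """Return the latest checkpoint strictly before the disruption, or None.

    `checkpoints` is any iterable of datetimes (hypothetical journal entries).
    """
    ordered = sorted(checkpoints)
    # bisect_left gives the index of the first checkpoint >= disruption_time,
    # so the entry just before it is the last "clean" point in time.
    idx = bisect_left(ordered, disruption_time)
    return ordered[idx - 1] if idx > 0 else None


def data_loss_window(recovery_point, disruption_time):
    """Time between the chosen recovery point and the disruption."""
    return disruption_time - recovery_point


# Hypothetical journal: one checkpoint every 5 seconds for one minute.
start = datetime(2024, 1, 1, 9, 0, 0)
journal = [start + timedelta(seconds=5 * i) for i in range(12)]

# Assumed moment at which the disruption is detected.
disruption = start + timedelta(seconds=42)

point = pick_recovery_point(journal, disruption)
loss = data_loss_window(point, disruption)
print(f"Rewind to {point.time()}, losing {loss.total_seconds():.0f}s of data")
```

In this sketch, the more frequently checkpoints are taken, the smaller the window of data that can be lost, which is the sense in which recovery can happen with “virtually no data loss”.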

Without the infrastructure dependencies that prevent easy movement, critical applications such as flight booking systems can live securely in, and move between, multiple on-premises and cloud environments as needed and as threats to business continuity arise.

The furore surrounding BA’s incident, as well as many others, demonstrates that IT resilience needs to be much more top of mind for all industries. Each time a data centre or IT disaster takes over the headlines, CIOs and IT professionals everywhere shudder, and rightly so.

Today, the most progressive within this group are leading by example. They are innovating by looking beyond their woefully outmoded disaster recovery infrastructure and embracing new models that provide greater flexibility to keep their businesses moving forward seamlessly through any disruption. Looking back on unplanned outages such as BA’s, it is vital to remember that they carry massive business risk but are, in fact, highly avoidable.

With next year’s IT budget planning discussions already unfolding for many enterprises around the globe, be sure to ask yourself and those involved: Are we truly confident we’re protected from becoming the next outage disaster headline?


Sourced from Peter Godden, VP of EMEA at Zerto

Nick Ismail

Nick Ismail is a former editor for Information Age (from 2018 to 2022) before moving on to become Global Head of Brand Journalism at HCLTech. He has a particular interest in smart technologies, AI and...