Availability – the changing role of business continuity

Keeping systems running despite whatever life might throw at them is a significant challenge.

Business continuity covers the processes, skills and technology involved in helping businesses survive issues, from IT problems like lost files through to large-scale disasters that affect thousands of people.

However, can new approaches make availability into something we take for granted? And should we still be asking the same questions about business continuity and data that we ask today?

How we got here

Continuity has attracted more attention this year because of security issues like ransomware, and because businesses have found themselves in the media spotlight during incidents. Even with the best IT planning and preparation, incidents can lead to downtime and frustration for customers.

>See also: Avoiding an outage disaster: continuous availability

For example, BA’s data centre issues this year have led to flights being grounded on several occasions, while ransomware attacks on retail firms, NHS Trusts and logistics companies alike have caused serious disruption.

Being prepared for these issues should therefore be in the best interests of every business. Traditionally, business continuity has broken down into two disciplines: availability and disaster recovery.

Availability planning is dedicated to helping services survive any incident, while disaster recovery (DR) focuses on getting systems back up and running as fast as possible after an event.

For most businesses, availability was incredibly expensive to achieve, requiring multiple data centres and hardware systems to be in place. This meant that the number of enterprises that could afford true continuous availability was low.

While banks, utility providers and national security organisations would invest in keeping their systems able to run come what may, other organisations looked at how to detect and recover swiftly.

Even this could get expensive quickly. First, recovering to standby hardware meant significant additional cost, while restoring from tape took a long time. Secondly, the recovery process itself could be a difficult one.

>See also: Disaster recovery: what will you do when the lights go out?

According to the most recent DR Benchmark report, around a quarter of organisations don’t test their DR processes at all, while 65 percent of organisations don’t successfully complete their own test recoveries.

This adds up to a lot of organisations that either don’t know how much risk their data is exposed to, or are not able to complete their own DR processes successfully. However, we can now make that data available and useful in new ways.

Taking a different path

Technologies like virtualisation and cloud have made a big impact on DR planning and recovery. Rather than needing specific hardware set-ups and precise mirroring of existing IT assets, companies can use virtualised instances to recover faster and at lower cost. However, these approaches still need testing and tending to get right.

Instead, we should look again at the concept of availability around applications and the data they use. How can we change our approach to how systems run in the first place, so that problems just don’t affect us? Can we make our systems more resilient rather than focusing on faster recovery?

>See also: How understanding risk is making data safer

This requires a mindset change in how applications are designed and supported. Rather than relying on the centralised model for how systems are put together, how can we spread these applications more widely?

Rather than trying to make individual central elements more resilient, how can we make our services run through any problem, no matter whether it affects a single machine or a whole data centre?

Looking at new models for data can help. Instead of designing systems for fast recovery, we can rethink the data model involved so that information can be spread across multiple locations and service providers.

With all data copied to multiple sites and systems, any individual problem will not affect the service. Companies can take an “always on” approach to these new applications, prioritising availability and customer experience.
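To make this concrete, here is a minimal sketch of what such a data model can look like. It uses Apache Cassandra’s NetworkTopologyStrategy purely as an illustration; the article does not name a specific technology, and the datacentre names, node addresses and replica counts below are assumptions.

```python
# A minimal sketch (illustration only): a keyspace whose rows are copied to
# three sites, so losing any single site does not lose data or interrupt the
# service. Datacentre names, addresses and replica counts are assumptions.
from cassandra.cluster import Cluster

# Placeholder contact points for nodes running in different locations.
cluster = Cluster(contact_points=["10.0.0.1", "10.1.0.1", "10.2.0.1"])
session = cluster.connect()

# Three full copies of every row in each of three datacentres: the company's
# own site, a US cloud region and a European cloud region (names hypothetical).
session.execute("""
    CREATE KEYSPACE IF NOT EXISTS orders
    WITH replication = {
        'class': 'NetworkTopologyStrategy',
        'onprem_dc': 3,
        'aws_us_dc': 3,
        'azure_eu_dc': 3
    }
""")
```

With a setting like this, every write is copied to all three sites automatically, so there is no separate recovery step to perform when one of them fails.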

Rather than being designed to run perfectly and then recover quickly, always-on systems should be built to work through any failure event. Whereas previously each IT component depended on expensive dedicated hardware, today’s services can run as software on inexpensive commodity infrastructure.

With work and data spread across tens or hundreds of software components, no single failure can bring down the rest of the components supporting the service.

For example, public cloud services can be used to complement internally hosted services. Alternatively, using multiple public cloud providers avoids depending on any single one.

>See also: Why critical data can’t be hosted with just one provider

After all, it is unlikely that an Amazon Web Services instance in the US will fail at the same time as a company’s own data centre and a Microsoft Azure instance in Europe.
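From the client’s point of view, not depending on a single provider can be as simple as knowing about more than one deployment of the same service. The sketch below is illustrative only: the endpoints are hypothetical placeholders, and the logic just tries each site in turn until one answers.

```python
# Minimal sketch: probe several deployments of the same service, hosted with
# different providers, and use the first one that answers. All URLs are
# hypothetical placeholders.
import urllib.request
import urllib.error

SERVICE_ENDPOINTS = [
    "https://orders.internal.example.com/health",  # company's own data centre
    "https://orders-us.aws.example.com/health",    # public cloud region in the US
    "https://orders-eu.azure.example.com/health",  # second provider, European region
]

def first_available(endpoints, timeout=2):
    """Return the first endpoint that responds with HTTP 200, or None."""
    for url in endpoints:
        try:
            with urllib.request.urlopen(url, timeout=timeout) as response:
                if response.status == 200:
                    return url
        except (urllib.error.URLError, OSError):
            continue  # this site or provider is unreachable; try the next one
    return None

if __name__ == "__main__":
    print(first_available(SERVICE_ENDPOINTS) or "all sites down")
```

In practice a data management layer or load balancer would do this routing automatically, but the principle is the same: more than one place to go means no single point of failure.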

Testing availability of services

For these applications, testing availability and continuity should be a case of looking at how services can cope with failure.

Taking elements of the application down to see how the service performs can seem like a risky course to take, particularly when many companies don’t test their services themselves.

However, this approach can actually make it easier to prove that the availability strategy works: if the application survives having some of its elements taken down, the strategy is doing its job. At the same time, the day-to-day service that customers receive should not be affected, even during the DR test.

The data management layer should spread the data automatically so that the service carries on with no interruption. While even the fastest recovery will lead to some downtime, an availability strategy will ensure that no service interruption takes place.
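One hedged way to picture such a test, without assuming any particular product, is a quorum-replicated store simulated in memory: write some data, take one replica away, and check that reads and writes carry on uninterrupted.

```python
# Self-contained sketch of an availability test: data is written to a quorum of
# replicas, one replica is then "taken down", and the test checks that reads
# and writes still succeed with no interruption. Names are illustrative only.
class Replica:
    def __init__(self, name):
        self.name, self.data, self.up = name, {}, True

class QuorumStore:
    def __init__(self, replicas):
        self.replicas = replicas
        self.quorum = len(replicas) // 2 + 1  # majority of replicas must ack

    def write(self, key, value):
        live = [r for r in self.replicas if r.up]
        for replica in live:
            replica.data[key] = value
        return len(live) >= self.quorum  # the write only counts with a majority

    def read(self, key):
        for replica in self.replicas:
            if replica.up and key in replica.data:
                return replica.data[key]
        return None

def test_service_survives_losing_a_site():
    store = QuorumStore([Replica("site-a"), Replica("site-b"), Replica("site-c")])
    assert store.write("order-42", "paid")

    store.replicas[0].up = False               # take one site down mid-test
    assert store.read("order-42") == "paid"    # reads carry on uninterrupted
    assert store.write("order-43", "shipped")  # writes still reach a quorum

test_service_survives_losing_a_site()
print("availability test passed: the service survived losing a site")
```

The same idea scales up to real deployments: deliberately remove a node, a rack or even a whole site, and watch whether customers notice.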

>See also: Cloud backup and recovery: helping businesses take on the giants

For customers, the result of this change should be invisible. For the business, however, these critical applications should keep running through any issue without downtime. New services can be designed with this approach in mind from the start, ensuring that availability is maintained as standard.

For existing services that use more traditional infrastructure, adding a data management layer over the top can provide the same level of availability around data for those legacy applications. This can extend the availability of existing data sets dramatically.

Business continuity planning remains a critical skill for IT teams within enterprises. Planning ahead to reduce the risk of interruption to any business service will continue to be valuable, no matter whether the service is based on existing technology or new cloud applications. Indeed, having additional copies of data in multiple places is the essence of business continuity.

>See also: Incident response: 5 key steps following a data breach 

However, the economics of making digital services more resilient have changed as new options like cloud and distributed computing have come to the fore. At the same time, considering availability from the start should help companies provide better services to their customers.

By looking at new approaches to managing data, companies can reduce their spend and improve the experience their customers receive from the start.

 

Sourced by Martin James, regional vice president, Northern Europe at DataStax
