High availability, business continuity, fault tolerance, disaster recovery: all of these labels reflect different approaches to keeping data and applications available to users when something goes wrong. Established techniques have been built up over the years to help achieve this, ranging from fully redundant hardware (using standby components to eliminate single points of failure) through to software techniques such as replication, which maintains identical copies of data in multiple locations.
Vendors offering high-availability features or products have typically approached the problem from a few different directions. One camp focuses on the infrastructure layer, making sure that servers, operating systems and databases keep running. Another starts with the application and tries to make sure users aren’t affected even if there is a failure somewhere in the layers beneath the surface. And a third camp relies heavily on tools that run in conjunction with shared storage resources in the form of storage area networks.
Of course there is a huge difference in terms of technology requirements between providing on-site and remote “availability”. And there are several types of “failure”, ranging from data corruption and user errors, through application, hard-drive and server failures, right through to a site-wide disaster. All require different technical approaches and different configurations of supporting infrastructure.
However, things move rapidly in the computer industry, and over the last few years increased use of server and storage virtualization has added a new element to the mix. Virtualization layers from companies such as VMware, Citrix and Microsoft enable “virtual machines” to be configured in software, which look to an application exactly like a physical computer. Pre-virtualization, users would often devote a single server to running a single application, in order to isolate workloads that would otherwise conflict with each other. But by encapsulating these applications within VMs, users could run multiple applications on a single server for the first time. VMware has claimed that its customers run an average of 10-12 virtual servers on a single physical server, though the figures vary widely depending on what sort of applications and operating systems are in use. The savings can be huge.
All this has some interesting ramifications for availability. By simplifying and cutting costs, virtualization has the potential to bring availability to a whole new set of customers and applications, significantly broadening the market. It opens up new options for protecting data and keeping applications up and running through both planned and unplanned downtime, locally and remotely. By abstracting application workloads from the underlying hardware and turning them into files, it becomes easier to duplicate them, move them around and get them running again on another server.
The most significant new factor is the isolation of workloads from the underlying hardware, which means that workloads can be easily replicated (or, if necessary, restarted) on any other virtualization-enabled hardware. Previously, that standby hardware would need to be identically configured to the primary systems. That means increased flexibility, no need to implement and maintain a single, uniform platform for availability purposes, and significantly reduced costs, because lower redundancy translates into higher utilization rates.
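The flexibility this brings can be sketched in a few lines of code. The following Python sketch is purely illustrative (all host and VM names, and the placement policy, are hypothetical): because a virtualized workload is just a set of files, a failover routine can restart it on any healthy host with spare capacity rather than on one identically configured standby.

```python
# Illustrative sketch: restarting an encapsulated workload on any healthy
# virtualization host. All names and figures here are hypothetical.

def pick_failover_host(vm, hosts):
    """Return any healthy host with enough spare capacity for the VM.

    Pre-virtualization, the candidates would have been limited to a single
    identically configured standby; here, any virtualization-enabled host
    with free resources will do.
    """
    candidates = [
        name for name, h in hosts.items()
        if h["healthy"] and h["free_ram_gb"] >= vm["ram_gb"]
    ]
    if not candidates:
        raise RuntimeError("no healthy host can accept this workload")
    # Simple placement policy: pick the host with the most free RAM.
    return max(candidates, key=lambda name: hosts[name]["free_ram_gb"])

hosts = {
    "host-a": {"healthy": False, "free_ram_gb": 64},  # failed primary
    "host-b": {"healthy": True, "free_ram_gb": 16},   # too small
    "host-c": {"healthy": True, "free_ram_gb": 48},
}
vm = {"name": "exchange-01", "ram_gb": 32}

target = pick_failover_host(vm, hosts)
```

The point of the sketch is the `candidates` list: the workload is no longer tied to one specific standby machine.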
Who’s selling what?
So who are the market players? The virtualization infrastructure vendors – VMware, Microsoft and Citrix Systems in particular – continue to build availability functionality into their core products, threatening in some cases to squeeze out competition from third-party vendors. A core part of what they offer is live migration of virtual machines, which can move workloads to a different server while they are running. This has the potential to eliminate the need for planned downtime altogether.
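Live migration is typically implemented as an iterative “pre-copy”: memory pages are copied to the target while the VM keeps running, pages dirtied in the meantime are re-copied, and only a small residue is transferred during a brief final pause. A toy simulation of that convergence (all rates and thresholds are invented numbers, not vendor figures):

```python
def precopy_rounds(total_pages, pages_per_sec_copied, pages_per_sec_dirtied,
                   stop_threshold, max_rounds=30):
    """Toy model of iterative pre-copy live migration.

    Each round copies the currently dirty pages; while that copy runs, the
    guest keeps dirtying pages. When the remaining dirty set shrinks below
    'stop_threshold', the VM is paused briefly for a final stop-and-copy.
    Returns (rounds_used, pages_left_for_final_pause).
    """
    dirty = total_pages            # first round copies all memory
    rounds = 0
    while dirty > stop_threshold and rounds < max_rounds:
        rounds += 1
        copy_time = dirty / pages_per_sec_copied
        # Pages the running guest dirtied while this round was copying:
        dirty = int(copy_time * pages_per_sec_dirtied)
    return rounds, dirty

# Converges quickly when copy bandwidth far exceeds the dirtying rate:
rounds, residue = precopy_rounds(total_pages=100_000,
                                 pages_per_sec_copied=10_000,
                                 pages_per_sec_dirtied=1_000,
                                 stop_threshold=50)
```

The geometric shrinkage of the dirty set is why the final pause can be short enough for users not to notice, which is what makes eliminating planned downtime plausible.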
But there is still plenty of room for other vendors to operate. They range from storage array vendors with their own hardware-specific tools (such as EMC, Hewlett-Packard and NetApp) to software-based replication vendors (such as Double-Take Software and Neverfail).
The storage vendors typically provide the best performance, but require an investment in expensive networked storage resources. Modular, iSCSI-based storage systems such as HP’s LeftHand and Dell’s EqualLogic are capitalizing on the new demand for availability, attracting customers that have previously been frightened away from shared storage by the complexities and expense of classic Fibre Channel-based storage networks. A related approach is that of the storage virtualizers, such as DataCore Software, FalconStor, SanRAD and StorMagic, which sell appliances that consolidate storage from many different types of storage hardware into a single pool. Virtual servers can use this pool for automatic failover and live virtual machine migration.
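The pooling idea can be illustrated with a minimal sketch (hypothetical class and device names, not any vendor’s API): capacity from heterogeneous devices is presented as one logical pool from which volumes are carved, so consumers of the storage never see which physical array their data lives on.

```python
class StoragePool:
    """Toy model of a storage virtualizer consolidating heterogeneous
    devices into one logical pool. Illustrative sketch only."""

    def __init__(self):
        self.devices = {}   # device name -> free capacity in GB
        self.volumes = {}   # volume name -> (backing device, size in GB)

    def add_device(self, name, capacity_gb):
        self.devices[name] = capacity_gb

    def total_free_gb(self):
        return sum(self.devices.values())

    def create_volume(self, name, size_gb):
        # Place the volume on any device with room; the consumer only
        # ever sees the pool, never the underlying hardware.
        for dev, free in self.devices.items():
            if free >= size_gb:
                self.devices[dev] = free - size_gb
                self.volumes[name] = (dev, size_gb)
                return dev
        raise RuntimeError("pool exhausted")

pool = StoragePool()
pool.add_device("fc-array", 500)     # classic Fibre Channel array
pool.add_device("iscsi-box", 1000)   # modular iSCSI system
pool.create_volume("vm-datastore", 800)
```

Because virtual machines see only pool-backed volumes, failover and live migration can move them between hosts without either host caring which array holds the bits.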
Host-based replication tools run on the server rather than within storage arrays. Their big advantage is that they can be tailored to work with specific applications; without that level of application awareness, some data is likely to be lost if the system goes down. Replication tools bring an application – most often Microsoft Exchange or SQL Server – back online in the same state it was in just before the crash, including any local work in progress. In contrast, VMware’s built-in high-availability feature, VMware HA, sacrifices application-specific recovery for broader workload applicability. Both of these software-based approaches can avoid the need to implement the more expensive cluster-aware editions of operating systems or databases.
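The host-based approach can be illustrated with a toy log-shipping replicator (a hypothetical sketch, not any vendor’s implementation): the primary ships each committed write asynchronously, and on failover the replica simply exposes what it has received, coming back up in the last consistent committed state.

```python
import queue

class LogShippingReplica:
    """Toy host-based asynchronous replication. Illustrative only.

    The primary ships committed writes over an async channel; the replica
    applies them in order. On failover, the replica exposes the last
    consistent state it has received; any in-flight writes that never
    arrived are the (small) window of potential data loss.
    """

    def __init__(self):
        self.channel = queue.Queue()   # stands in for the network link
        self.state = {}                # the replica's copy of the data

    def primary_commit(self, key, value):
        # Application-aware: only *committed* writes are shipped, so the
        # replica never sees a torn, mid-transaction state.
        self.channel.put((key, value))

    def replica_apply_pending(self):
        while not self.channel.empty():
            key, value = self.channel.get()
            self.state[key] = value

    def failover(self):
        self.replica_apply_pending()
        return dict(self.state)

r = LogShippingReplica()
r.primary_commit("mailbox:alice", "msg-41")
r.primary_commit("mailbox:alice", "msg-42")
state = r.failover()   # replica comes up in the last committed state
```

The design choice worth noting is the queue between the two sides: because shipping is asynchronous, the primary’s performance is not gated on the replica, which is what makes this style of replication practical over a WAN to a remote site.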
Smaller companies running mainstream applications are likely to benefit from a combined approach to local high availability and remote disaster recovery. This is being driven by the increasingly common deployment of physical, virtual and (soon) cloud-based systems side by side. Many smaller businesses that have never implemented any form of disaster recovery may decide to do so now. The advantage of combining asynchronous replication with virtualization for disaster recovery – whether at a secondary company location or via cloud-based services – is that the systems at each site don’t have to be identical. Neverfail’s vAppHA, working in conjunction with VMware’s vSphere HA and Site Recovery Manager products, is an example of this approach.
Cloud-based disaster recovery services represent a greenfield opportunity for vendors, including server makers, services vendors, managed hosting firms and phone companies. It’s unlikely this type of business will go to consumer-oriented cloud services such as Amazon and Google. Traditional hosting vendors such as SunGard are already moving in, targeting small businesses that are virtualizing their primary sites but haven’t had the resources to set up their own remote disaster-recovery site.
Properly implemented, virtual infrastructure can form the basis of automated backup, retention, business-continuance and disaster-recovery processes. Once it is set up, any newly deployed virtual machine is automatically included within the company-wide availability plan: availability becomes a property of the virtual machine. ‘Dial-up’ levels of availability can be implemented, depending on the requirements of specific applications or departments. The higher the level of availability, the greater the expense – so careful planning is required to achieve the best value for money.
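One way to picture ‘dial-up’ availability is as a per-VM policy table mapping tiers to recovery objectives and cost. The tier names, RPO/RTO values and cost figures below are purely illustrative:

```python
# Illustrative availability tiers; every value here is hypothetical.
# RPO = recovery point objective (tolerable data loss);
# RTO = recovery time objective (tolerable downtime).
TIERS = {
    "platinum": {"rpo_minutes": 0,    "rto_minutes": 5,    "relative_cost": 10},
    "gold":     {"rpo_minutes": 15,   "rto_minutes": 60,   "relative_cost": 4},
    "bronze":   {"rpo_minutes": 1440, "rto_minutes": 1440, "relative_cost": 1},
}

def cheapest_tier(rpo_needed_minutes, rto_needed_minutes):
    """Pick the lowest-cost tier that still meets an application's
    recovery point and recovery time objectives."""
    ok = [
        (spec["relative_cost"], name)
        for name, spec in TIERS.items()
        if spec["rpo_minutes"] <= rpo_needed_minutes
        and spec["rto_minutes"] <= rto_needed_minutes
    ]
    if not ok:
        raise ValueError("no tier meets these objectives")
    return min(ok)[1]

# An email server that can tolerate 30 min of data loss and 2 h downtime:
tier = cheapest_tier(30, 120)
```

Framing availability this way makes the cost trade-off explicit: the tightest objectives always map to the most expensive tier, so departments can dial protection down wherever their applications allow it.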
John Abbott, Founder & Chief Analyst, The 451 Group, is leading the keynote on “Break Point: Disaster Recovery (DR) & The New Availability” at Storage Expo, 14th-15th October, Olympia, London. www.storage-expo.com