In this digital, data-driven world, system downtime poses a significant threat to an organisation's business operations. Analyst house Aberdeen estimates the average cost of downtime at well over £100,000 per hour.
As a result, reducing system failure and the risk associated with IT has moved high up on the agenda. Business continuity and disaster recovery plans should be in place, regardless of an organisation's size. These are there as a last resort, but not having a plan to handle downtime could create a bigger headache than the downtime itself.
Whilst it is widely accepted that downtime in some form will sometimes happen, it is important that efforts are made to reduce it. No matter how good your disaster recovery plans may be, most organisations would rather never have to use them.
When critical servers go down, the risk of losing data is significantly increased – documents, data, communications and information can all disappear. Organisations can be left unable to return to normal capacity, with key information missing.
Despite what many users think, downtime isn't always the fault of IT. It can be caused by issues as simple as overloaded servers, so making sure they are properly provisioned and maintained needs to be high on the agenda.
Of course, no IT team can guarantee 100% availability and zero downtime, but getting as close as possible is essential, and there are several steps organisations can take.
1. Check SLAs of key vendors and partners
Checking the service level agreements (SLAs) of all key vendors and partners your organisation is using is a really simple place to start. Any good enterprise IT product will come with an agreed level of availability, and if this doesn’t meet business needs, alternatives should be explored.
There has been a lot of talk about achieving 'five nines' (99.999% availability), which allows a maximum of roughly five minutes 15 seconds of downtime a year. Organisations can't expect this, however, if they are using technology with low SLAs of just 90% uptime, which permits over 36 days of downtime annually.
This can be improved upon by implementing active-active clustering, but companies should look to achieve the maximum uptime possible.
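To make those numbers concrete, here is a short Python sketch (the percentages are illustrative, not any vendor's actual SLA terms) that converts an SLA figure into allowed downtime, and shows why a chain of dependent services is only as available as the product of their SLAs:

```python
# Sketch: turning an SLA percentage into allowed downtime per year,
# and combining the SLAs of services that all must be up (serial chain).
# Figures are illustrative examples, not quotes from any vendor.

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600

def downtime_minutes_per_year(availability_pct: float) -> float:
    """Maximum downtime per year implied by an availability percentage."""
    return MINUTES_PER_YEAR * (1 - availability_pct / 100)

def combined_availability(*availability_pcts: float) -> float:
    """Overall availability of a chain of services that all must be up.
    The product is always lower than the weakest single SLA."""
    result = 1.0
    for pct in availability_pcts:
        result *= pct / 100
    return result * 100

print(f"99.999% SLA -> {downtime_minutes_per_year(99.999):.2f} minutes of downtime/year")  # ~5.26
print(f"90% SLA -> {downtime_minutes_per_year(90) / 60 / 24:.1f} days of downtime/year")   # ~36.5
print(f"three chained 99.9% services -> {combined_availability(99.9, 99.9, 99.9):.4f}%")   # ~99.7003
```

Note that three services each meeting 99.9% uptime still deliver only about 99.7% end-to-end, which is why the SLA of every key vendor in the chain matters.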
2. Use active-active clustering for all core IT systems
Active-active clusters help to balance server workloads across different networks, reducing the risk of downtime by minimising overloads. Some organisations still use active-passive clusters, but these involve a more expensive, hardware-heavy approach, reliant on a set of redundant servers that only come online when a primary system fails.
Active-passive clusters also tend to be much less secure, with survey respondents running passive environments reporting 34% more data and critical emails lost than those respondents who have active-active clustering environments.
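The availability gain from redundancy can be sketched with some simple arithmetic, under the simplifying assumption that node failures are independent (real clusters only approximate this): the cluster is down only when every node is down at once.

```python
# Sketch: why redundant active-active nodes raise availability.
# Assumes node failures are independent, which real deployments
# only approximate (shared power, network, or software faults
# can take several nodes down together).

def cluster_availability(node_availability_pct: float, nodes: int) -> float:
    """Availability of a cluster that stays up while any node is up."""
    down = 1 - node_availability_pct / 100
    return (1 - down ** nodes) * 100

# Two modest 99% nodes already beat a single higher-spec 99.9% server:
print(cluster_availability(99.0, 1))  # ~99.0
print(cluster_availability(99.0, 2))  # ~99.99
print(cluster_availability(99.0, 3))  # ~99.9999
```

Each extra active node multiplies the remaining downtime fraction by the single-node failure rate, which is why clustering is one of the cheapest routes towards 'five nines'.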
3. Deploy load-balancing and highly scalable infrastructures
The ability to scale and balance workloads across multiple servers is required to ensure fast and efficient business transactions. Load-balancing means that when one node is unavailable, such as when it is performing multiple actions on a file, other nodes can continue to process files and respond to requests.
Building an infrastructure that is quick and easy to scale will pay massive dividends, as it will help to keep your organisation ahead of its own growth and demand for IT.
Even when downtime is planned, it can still hinder productivity and prove costly to an organisation. It is therefore critical that infrastructures are, firstly, balanced to cope with demand and, secondly, easily scalable, reducing the need for any planned downtime.
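A minimal Python sketch of the round-robin idea behind load balancing follows; the node names and the health check are hypothetical, standing in for whatever monitoring a real balancer would use:

```python
# Minimal sketch of round-robin load balancing with a health check:
# requests rotate across nodes, and a node that is busy or down is
# skipped so the remaining nodes keep serving. Node names and the
# health-check predicate are hypothetical placeholders.

from itertools import cycle

class RoundRobinBalancer:
    def __init__(self, nodes):
        self._ring = cycle(nodes)
        self._count = len(nodes)

    def pick(self, is_healthy) -> str:
        """Return the next healthy node, or raise if all are down."""
        for _ in range(self._count):
            node = next(self._ring)
            if is_healthy(node):
                return node
        raise RuntimeError("no healthy nodes available")

lb = RoundRobinBalancer(["node-a", "node-b", "node-c"])
healthy = lambda node: node != "node-b"  # pretend node-b is mid-transfer
picks = [lb.pick(healthy) for _ in range(4)]
print(picks)  # node-b is skipped; node-a and node-c keep serving
```

Scaling then amounts to adding nodes to the ring: the balancer absorbs growth without any planned outage, which is the point the section above makes.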
4. Rely on a single, managed file transfer solution
Using multiple file servers can create system-to-system integration vulnerabilities, heightening a company’s risk of downtime, glitches and security breaches. Managed file transfer dramatically reduces the risk of downtime posed by data breaches whilst allowing flexible file sharing with clients, partners and other staff.
If an organisation uses multiple vendors for one or a few similar services, it can be more difficult for all parties involved to spot potential vulnerabilities, and for vendors to assist with the right support.
Using the same vendor can really help to reduce downtime: systems and processes across the organisation will be compatible rather than competing, and simpler networks make vulnerabilities much easier to spot.
In short, downtime is never going to be removed as a threat to business continuity, but it can be reduced. The high value placed on IT systems today is making downtime a growing concern.
Processes such as banking, communications, sales, and correspondence have all been digitised, and as a result there is an increasing reliance on IT and demand for always-on services.
Reducing downtime doesn’t have to break the budget, but the right tools need to be in place. Active-active clustering should be used, and servers should be built to load-balance and be highly scalable from day one.
Equally, checking SLAs and using file transfer technology from one vendor could even save money. Taking these four steps will help to increase uptime, and ultimately better maintain business continuity.
Sourced from Matt Goulet, COO of Globalscape