Lessons learned from an IT failure

Society’s reliance on the internet, and particularly on the web, is such that many people can’t imagine life without it. People now shop, communicate, enjoy entertainment, share information and bank online. If a website or cloud service goes down, it doesn’t take long before its users or customers notice, and the outage can quickly become headline news.

Whether it is a deluge of customers on a landmark shopping day, the impact of a cyber attack, or something simple like an engineer accidentally unplugging the wrong machine, it doesn’t take much to take down a website if the right measures aren’t in place to cope with demand or failure.

Publishers, retailers and other website operators need to have measures in place to maintain quality of service (QoS) as well as continuity of service during times of high demand or service disruption.

Can you say today that your website will stay accessible and responsive to your audience 24/7?

There are simple steps that an organisation can take immediately to maintain site and service performance, and to preserve business continuity in the event of an outage:

1) Adequate resources for your web site

The first thing to do is ensure you have the right resources in place to deliver your web site and its content to your target audience. Can your web site scale to deal with the number of users or customers you can potentially attract?

If not, you need to add website capacity, or at least put in place other measures that can scale to handle spikes in demand. Will your content management system or web platform code cope with a sudden jump in transactions, or with heavy demand for a single page? If the current platform can’t cope with the sudden influx of transactions and requests that a milestone event can bring, use a quiet period to implement a platform change or upgrade.
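The capacity question above can be turned into a rough back-of-the-envelope calculation. The following is a minimal sketch, not a sizing method from the author: the function name `servers_needed`, the traffic figures and the 30% headroom are all illustrative assumptions, and real capacity should come from load testing your own platform.

```python
import math


def servers_needed(peak_rps: float, rps_per_server: float, headroom: float = 0.3) -> int:
    """Estimate how many servers cover a peak load with spare capacity.

    peak_rps       -- expected requests per second at the peak (assumed figure)
    rps_per_server -- requests per second one server sustains (from load tests)
    headroom       -- fraction of each server kept in reserve (0.3 = 30%)
    """
    if peak_rps < 0 or rps_per_server <= 0 or not 0 <= headroom < 1:
        raise ValueError("invalid capacity parameters")
    # Only part of each server's capacity is counted as usable.
    usable_rps = rps_per_server * (1 - headroom)
    return max(1, math.ceil(peak_rps / usable_rps))


# Hypothetical numbers: a normal day versus a tenfold spike.
normal_day = servers_needed(peak_rps=400, rps_per_server=250)
spike_day = servers_needed(peak_rps=4000, rps_per_server=250)
print(normal_day, spike_day)
```

Comparing the two results against the capacity you actually have gives a first indication of whether a spike would need extra servers, a platform upgrade or a delivery platform in front of the site.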

Also, do users connect directly to one location to access your website, or do you use a delivery platform to ensure that data and other site traffic is distributed effectively worldwide? If you are not already using a delivery platform, it is an option worth considering.

Delivery platforms use infrastructure that can handle large spikes in traffic, such as those caused by trading events like Black Friday, the launch of popular streaming content or a moment of breaking news. At the same time, the deluge of traffic is kept at arm’s length from the main site, preventing it from being overwhelmed or compromised.

2) Backups for your backups

It might seem obvious, but this is an area that continues to be overlooked by many organisations, especially when it comes to their websites. If your website is your primary shop front or point of customer interaction, you can’t afford for it to be down, let alone for data to be lost.

Test backups regularly. Don’t rely on a single backup platform: maintain at least two types of backup, ideally one on-site and one off-site. This will also help with business continuity by making it easier to access and restore data and services in the event of an issue. Off-site website mirrors also make it easier to switch an application performance or delivery platform over to a backup of your site, which helps maintain continuity of customer experience if the primary website location goes down.
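One simple way to start testing backups is to confirm that every copy still matches the original. The sketch below assumes file-based backups reachable as local paths; the helper names `sha256_of` and `verify_backups` are hypothetical. A matching checksum only shows the copy is intact, so it complements a full restore rehearsal rather than replacing one.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_backups(original: Path, copies: list[Path]) -> dict[str, bool]:
    """Map each backup path to True if it exists and matches the original."""
    expected = sha256_of(original)
    return {
        str(copy): copy.is_file() and sha256_of(copy) == expected
        for copy in copies
    }
```

In practice the list of copies would include at least one on-site and one off-site location, and any `False` result would be a prompt to investigate before the backup is actually needed.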

3) Business continuity measures

A clear and well-rehearsed business continuity plan is essential for critical business systems, as is having the right partner services in place to maintain continuity of service in the event of something like a surge in demand, security issue or content problem.

This is one area where the role of a dedicated platform for delivering your website and its content to users can mitigate the impact of an outage.

A content delivery platform provides built-in redundancy. It adds an important buffer between user and content owner that ensures the user still receives a high standard of service even while the owner attends to any technical issue on the back end.
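The buffering idea can be illustrated with a very small failover routine: try the primary site first and fall back to a mirror if it does not respond. This is only a sketch of the general technique, not how any particular delivery platform works; the function names `http_ok` and `choose_origin` and the example URLs are assumptions made for illustration.

```python
import urllib.error
import urllib.request
from typing import Callable, Optional, Sequence


def http_ok(url: str, timeout: float = 2.0) -> bool:
    """Return True if the URL answers with a 2xx status within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError, ValueError):
        # Connection failures, timeouts, HTTP errors and malformed URLs
        # are all treated as "unhealthy".
        return False


def choose_origin(
    origins: Sequence[str],
    is_healthy: Callable[[str], bool] = http_ok,
) -> Optional[str]:
    """Return the first healthy origin in priority order, or None."""
    for url in origins:
        if is_healthy(url):
            return url
    return None


# Hypothetical usage: primary site first, off-site mirror second.
# origin = choose_origin(["https://www.example.com/health",
#                         "https://mirror.example.com/health"])
```

Passing the health check in as a parameter keeps the selection logic testable without network access, and lets the check be swapped for whatever monitoring the organisation already runs.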

In environments where 100% uptime is essential, such as retail, transport, banking and publishing, a global delivery system that won’t be derailed by one small technical glitch will retain customer confidence during a period of stress.

With the web and hosted applications, speed and continuity are key. If your site isn’t there, or takes an eternity to load, people will go somewhere else. Whether users are on a fixed connection or a mobile device, the expectation of service is the same.

For site operators and publishers, using the right tools and platforms to deliver content no matter what will ensure customer loyalty and confidence, and provide the time and breathing space to correct technical and demand issues when they arise.


Sourced by Andrew Bartlam, VP EMEA, Instart Logic


Nick Ismail

Nick Ismail is a former editor for Information Age (from 2018 to 2022) before moving on to become Global Head of Brand Journalism at HCLTech. He has a particular interest in smart technologies, AI and...
