Website outages and how to prevent them

Unplanned website outages are incredibly frustrating for any organisation. Even the shortest downtime can shatter an organisation’s reputation and cost a fortune in lost revenue, as scores of irritated customers take their business elsewhere. Yet although this age-old problem is entirely avoidable, it continues to affect organisations of all shapes and sizes. Why?

Computer says no

The Conservative Party is Britain’s party of government and one of its most prominent political groups, so its website is frequently visited. Yet earlier this year, visitors to the site encountered an alert message warning them that their intended destination might be suffering from security issues:

“Your connection is not private. Attackers might be trying to steal your information from [the site] (for example, passwords, messages or credit cards).”

The Conservatives are not alone here. In November last year, LinkedIn suffered a similar outage, as visitors across the US, UK, Canada, Australia and elsewhere were met with a bizarre ‘CERT_DATE_INVALID’ warning, rather than their social network. For the huge number of organisations and individuals across the world that rely on LinkedIn’s services for their own businesses, this was unacceptable. A five-minute outage would have been bad enough – an hour-long outage prompted something approaching a global outcry before LinkedIn was even able to assess the damage done to its reputation.

The common factor in these two incidents was an expired digital certificate. Every website served over HTTPS relies on a unique identity: a digital certificate that tells browsers and devices that the website is trustworthy, and that your connection to it is encrypted. When this identity expires, the browser can no longer guarantee that the website is secure. In many cases the browser or device will err on the side of caution; it will either warn users that the connection is unsafe or prevent access outright.
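As a rough illustration of the expiry check a browser performs, the Python sketch below works out how many days remain before a certificate's expiry date. The function names and the sample date are assumptions made for illustration and are not drawn from either incident; only the standard library is used.

```python
import socket
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after: str, now: datetime) -> int:
    """Return the days between `now` and a certificate's notAfter date.

    `not_after` uses the format returned by getpeercert(), for example
    "Jan  5 09:34:43 2018 GMT". The result is rounded down, so any
    negative value means the certificate has already expired.
    """
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    return (expires - now).days


def fetch_not_after(host: str, port: int = 443) -> str:
    """Read the notAfter field of the certificate a server presents.

    With default verification, an already expired certificate makes the
    handshake raise ssl.SSLCertVerificationError, which is the same
    failure a browser turns into a warning page.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=10) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]
```

Calling `days_until_expiry(fetch_not_after(host), datetime.now(timezone.utc))` for a host of your choosing gives the remaining lifetime of its certificate; a small or negative number is the early warning that both organisations missed.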

The value of trust

The simple solution to this problem is to renew the certificate before it expires, to ensure continuous service. However, this isn’t always as easy as it seems. Organisations rely on these identities for more than just websites: every machine (that is, every device, program and application) needs an identity to show other machines that it’s trustworthy. These identities act as ‘passports’, allowing each machine to know that the machine it’s talking to is what it claims to be. Without a valid passport, there’s no way for machines to trust each other, and applications, websites, devices and programs would simply cease to function. Most firms have thousands of machine identities in use, each with its own expiry date. Keeping track of every single one can be complicated.

A growing problem

While the process of replacing an expired machine identity isn’t complicated, it’s understandable that outages of this nature continue to happen. Organisations typically have thousands of machine identities in use across the business. Keeping tabs on when every single one is due to expire isn’t something that can be tackled with a simple spreadsheet; indeed, many businesses have thousands that they’re not even aware of.

Moreover, reliance on them is increasing. As digital transformation rises up the agenda for most organisations, DevOps, AI and Internet of Things deployments have become increasingly popular, and all rely on machine identities to run. Complicating matters even further, each unknown machine identity can be a flashpoint for cybercrime, as hackers are increasingly able to exploit such identities in order to appear legitimate, bypass security defences and infiltrate networks.

Staying in control

All of this makes control over machine identities more important than ever. As the number of machine identities in use skyrockets, organisations will become increasingly unable to manage them without an automated process, and the number of unplanned outages will rise. As such, every firm should be automating the discovery, management and replacement of every single machine identity it relies on, alongside automating the process of monitoring them for signs of misuse. Had the powers that be within the Conservative Party had this capability, they would have spotted the impending expiry, acted on it accordingly, and spared themselves the embarrassment of a website outage. Yet without this capability, a high-profile website outage may well be just the start of an organisation’s problems.
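To give a sense of what the tracking part of such automation involves, here is a minimal Python sketch that sorts an inventory of machine identities into those already expired and those due for renewal. The `MachineIdentity` record, the `triage` function and the 30-day renewal window are all hypothetical choices made for illustration; discovery, replacement and misuse monitoring are outside its scope.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class MachineIdentity:
    name: str          # hypothetical label, such as a hostname or service name
    expires: datetime  # certificate expiry date (timezone-aware)


def triage(inventory, now, renew_window=timedelta(days=30)):
    """Split an inventory into expired and soon-to-expire identities.

    Returns (expired, due), each sorted by expiry date, earliest first.
    Identities expiring after the renewal window appear in neither list.
    """
    expired = sorted(
        (i for i in inventory if i.expires <= now),
        key=lambda i: i.expires,
    )
    due = sorted(
        (i for i in inventory if now < i.expires <= now + renew_window),
        key=lambda i: i.expires,
    )
    return expired, due
```

Run on a schedule against a complete inventory, a check like this would raise the alarm weeks before an expiry date rather than after visitors start seeing warnings. The hard part in practice is the inventory itself, since identities nobody knows about never make it onto the list.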

By Marc Madison, Director, Professional Services at Venafi
