How can you make critical IT events less critical?

How can you improve operational efficiency to nullify or better deal with critical IT events?

Guaranteed uptime. Five nines. Always-on. A lot is expected from technology and, by and large, it delivers.

But even if your digital estate is running to five nines, that still leaves 0.001% of the time when IT can fail, and that failure has a cost.
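
To put that 0.001% in concrete terms, the short sketch below converts an availability target into permitted downtime per year. The function name and the 365-day year are illustrative assumptions, not figures from the Quocirca research.

```python
# Convert an availability target (in percent) into allowed downtime per year.
# Assumes a 365-day year; the function name is illustrative only.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the minutes per year a service may be down at this availability."""
    unavailable_fraction = (100.0 - availability_percent) / 100.0
    return MINUTES_PER_YEAR * unavailable_fraction


for label, pct in [("three nines", 99.9), ("four nines", 99.99), ("five nines", 99.999)]:
    minutes = downtime_minutes_per_year(pct)
    print(f"{label} ({pct}%): {minutes:.2f} minutes of downtime per year")
```

At five nines the allowance comes to a little over five minutes a year, which is why even a single multi-hour event breaks the target.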

The cost can be pounds on an invoice or a reduction in income.

The cost can also be less tangible: reputational loss, for instance, runs to an average of €88,000 per event.

These losses have made downtime a bigger concern than security for businesses today, according to Quocirca.

Quocirca recently took a look at critical IT events (CIEs) and how they affect businesses, because it’s a pertinent issue right now.

Hybrid cloud has made infrastructures more complex, while companies and consumers are both more reliant on technology. This makes CIEs more critical than ever before.

Whether it’s an electricity company suffering a power outage from faulty hardware, a bank hit by an in-app fault that stops customers from paying bills, or an email provider experiencing service downtime, these events are increasingly in the public eye.

CIEs do happen and anyone can fall victim to them. In fact, on average, companies will experience two to three per month. But the real test is not whether you get knocked down but how quickly you get back up.

The time it takes to get back up, known as the mean time to repair (MTTR), is almost seven hours on average for these critical events, and the repair takes a team of 18 people.
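
As a rough illustration of how MTTR is derived, the sketch below averages the repair time over a small incident log. The timestamps are hypothetical sample data chosen for the example; only the team size of 18 comes from the figures above.

```python
from datetime import datetime, timedelta

# Hypothetical incident log: (detected, resolved) timestamps for each event.
# These values are made up for illustration and are not survey data.
incidents = [
    (datetime(2024, 1, 3, 9, 0), datetime(2024, 1, 3, 15, 30)),     # 6.5 hours
    (datetime(2024, 1, 17, 22, 15), datetime(2024, 1, 18, 6, 45)),  # 8.5 hours
    (datetime(2024, 1, 29, 13, 0), datetime(2024, 1, 29, 19, 0)),   # 6.0 hours
]

TEAM_SIZE = 18  # average team size quoted in the article


def mean_time_to_repair(events):
    """Average the time between detection and resolution across all events."""
    durations = [resolved - detected for detected, resolved in events]
    return sum(durations, timedelta()) / len(durations)


mttr = mean_time_to_repair(incidents)
mttr_hours = mttr.total_seconds() / 3600
person_hours = mttr_hours * TEAM_SIZE

print(f"MTTR: {mttr_hours:.1f} hours")
print(f"Effort per event with a team of {TEAM_SIZE}: {person_hours:.0f} person-hours")
```

Multiplying the repair time by the team size gives a sense of the effort tied up in each event, before any cost to the wider business is counted.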

This is the first tangible cost: on average €27,000 per CIE in IT alone, entirely separate from the €88,000 ‘cost to business’.

Companies are, of course, aiming to bring down the MTTR, but even a significant reduction still leaves a cost.

Across the year, reputational loss accounts for an average of €4,140,000 in CIE-related costs. That’s difficult to simply brush off.
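
The annual figure can be reproduced from the per-event numbers quoted earlier. The sketch below assumes, and this is an inference rather than something the article states, that the total combines the €27,000 IT cost and the €88,000 business cost at the upper end of two to three events per month.

```python
# Per-event figures quoted in the article (euros).
IT_COST_PER_CIE = 27_000
BUSINESS_COST_PER_CIE = 88_000

# Assumption: the upper end of the "two to three per month" range.
EVENTS_PER_MONTH = 3


def annual_cie_cost(events_per_month: int, it_cost: int, business_cost: int) -> int:
    """Total yearly cost if every event incurs both the IT and business cost."""
    events_per_year = events_per_month * 12
    return events_per_year * (it_cost + business_cost)


total = annual_cie_cost(EVENTS_PER_MONTH, IT_COST_PER_CIE, BUSINESS_COST_PER_CIE)
print(f"Estimated annual CIE cost: €{total:,}")
```

Under those assumptions the total matches the €4,140,000 quoted; at two events per month the same calculation gives €2,760,000.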

The reason that businesses are more concerned about CIEs than security is that these events aren’t going away. Businesses therefore need to bounce back more quickly, to reduce how ‘critical’ each event is and minimise its impact.

Performing root cause analysis (RCA), as many companies already do, is vital to reducing the likelihood that the same event will happen again, although it is time-consuming at 12.5 hours per event.

There are some barriers in the way, though. For instance, although mobile devices are very active within every enterprise, they are also the devices into which companies have the least visibility.

This makes RCA much tougher to achieve.

Organisations with higher levels of operational intelligence will be able to capture insights from all data sources across the business.

This visibility reduces MTTR and RCA time. It also improves the detection of future events and reduces how critical they become.

As infrastructure elements vary, different skills need to be activated in response to different events.

Bringing these skills together quickly is a challenge, but giving the people who hold them total visibility of the infrastructure right away helps ensure success.

With that visibility, CIE teams can resolve and analyse unplanned events quickly and bring the focus of IT back to innovation and driving value within the business.

 

Sourced by Matt Davies, senior director at Splunk

Nick Ismail
