How the United Airlines network glitch could’ve been avoided with simple network monitoring

In July this year a computer fault forced United Airlines to ground its flights in the US for the second time in a matter of weeks. The problem, it turned out, was a ‘network connectivity’ issue caused by a computer router malfunction.

The impact of the glitch proved significant for United, from both an operational and a brand reputation perspective. The two-hour-long issue caused delays to more than 90 aircraft, resulting in the company once again hitting the headlines for all the wrong reasons.

Following hot on the heels of an earlier incident in June, when the airline enforced a short flight ban after incorrect data appeared in its flight planning system, this second incident ended up costing the firm dear: its shares fell more than 1.5% in the day’s trade.

Network and application performance defines your reputation

It may seem surprising that something as simple as a router malfunction was enough to derail operations at United and generate serious financial and operational consequences.

But today’s networks are constantly evolving and becoming ever more complex – incorporating wired, wireless, physical, cloud, virtual, hosted, on-premise, and hybrid systems and applications. What’s more, IT departments are struggling to cope with escalating IT security and regulatory compliance demands – while battling the challenge of hidden threats generated by BYOD, rogue and non-sanctioned devices.

So, while the United Airlines story demonstrates just how dramatically things can go wrong in a moment, it does put the spotlight on the bigger question of just how well prepared the majority of organisations are when it comes to assuring network and application performance.

It also highlights why efficient and effective network and application monitoring tools are becoming a ‘must have’ in a world where the non-availability of a network frequently equates to ‘no business’ – and results in frustrated customers, disgruntled employees and perplexed partners.

Downtime is not an option

Today’s IT pros are being tasked with the increasingly difficult job of keeping their organisation’s network running effectively and efficiently. Ensuring business continuity is the name of the game – because the real cost of downtime is crippling.

According to Dun & Bradstreet, the productivity impact of downtime alone is estimated at more than $46 million per year for a Fortune 500 enterprise. And while the exact hourly cost of downtime for a midsized business may be lower, the proportional impact is much larger. As organisations continue to automate and depend on the network to get business done, the availability and performance of critical systems is a deal breaker.

In today’s non-stop hyper-connected world, systems must be up 24/7. And if IT can’t pinpoint problems fast, the business impact can be crippling. To minimise risk and the cost of downtime, IT teams need to be able to mitigate issues before users are impacted, rapidly finding and fixing any problems that do occur.

Are you monitoring the entire network?

With IT complexity growing every day, at an almost exponential rate, current strategies, resources and personnel may not be enough to keep pace. With the IT team juggling reduced headcounts and/or budgets – while being tasked with delivering more for less – the pressure is on like never before to continuously monitor and manage every aspect of the network.

Yet recently, at the Cisco Live event in San Diego, delegates were asked ‘Who here is monitoring their entire network?’ Several people asked for clarification, not believing we were talking about every device, every port on a 48-port switch, and every last interface on a server.

Of the 800 people at the event, only five were able to confirm that yes, they were monitoring their entire network. That’s less than one percent – food for thought, indeed.

One thing is clear. IT departments simply can’t afford to make the slow trudge through an entire suite of solutions hoping to find the root cause of a problem. With 70% of organisations reporting in a recent survey that a critical network event took at least one business day to resolve, achieving a comprehensive real-time view of network and server performance, availability and health has never been more important.

Unified monitoring – proactive, fast, and easy to use

As the pressure mounts on IT to meet availability targets, automated and unified monitoring becomes essential as a means to understand, monitor and inform IT teams about a network’s makeup, health, and potential and actual problems.

Today’s IT teams need tools that solve real problems, install easily and don’t require huge teams of experts to configure. Such tools should make it easy to get up and running in hours – not days or weeks – and to quickly discover the network and its dependencies. They also need to deliver real-time monitoring and early warning alerts, so IT can respond fast before a minor issue escalates into a full-blown problem.
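
To make the idea of an early warning alert concrete, here is a minimal sketch in Python of how a monitor might poll a device and raise a warning before an outage becomes total. It is an illustration only, not a description of any particular product: the function names `probe` and `classify`, the 200 ms latency threshold and the three-failure rule are all assumptions chosen for the example.

```python
import socket
import time
from typing import List, Optional

# Hypothetical thresholds, chosen purely for illustration.
WARN_LATENCY_MS = 200.0   # average connect time above this raises a warning
CRITICAL_FAILURES = 3     # this many consecutive failed probes is critical


def probe(host: str, port: int, timeout: float = 2.0) -> Optional[float]:
    """Return the TCP connect time in milliseconds, or None if unreachable."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return (time.monotonic() - start) * 1000.0
    except OSError:
        return None


def classify(samples: List[Optional[float]]) -> str:
    """Map recent probe results (oldest first) to 'ok', 'warning' or 'critical'."""
    # Count how many of the most recent probes failed in a row.
    trailing_failures = 0
    for sample in reversed(samples):
        if sample is not None:
            break
        trailing_failures += 1

    if trailing_failures >= CRITICAL_FAILURES:
        return "critical"
    if trailing_failures > 0:
        return "warning"  # early warning: the latest probe failed

    # No recent failure, so look for slowly degrading response times.
    successes = [s for s in samples if s is not None]
    if successes and sum(successes) / len(successes) > WARN_LATENCY_MS:
        return "warning"
    return "ok"
```

In use, a scheduler would call `probe` against each device every few seconds, keep the last handful of results, and pass them to `classify`. The point of the ‘warning’ state is that a single missed probe or a creeping rise in latency is flagged while the device is still mostly working, rather than after users have already been affected.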

In the case of United Airlines, where a single router brought operations to a complete halt, a review of the airline’s network infrastructure, monitoring tools and management should help to identify where the original issue lay.

But other organisations would do well to consider how well they’re tackling the growing challenge of keeping their network running efficiently and effectively.

Sourced from Alessandro Porro, VP of International, Ipswitch

Ben Rossi

Ben was Vitesse Media's editorial director, leading content creation and editorial strategy across all Vitesse products, including its market-leading B2B and consumer magazines, websites, research and...