What’s going on in your infrastructure – what APM tools aren’t seeing

In the digital age, many organisations struggle to manage and optimise their hybrid data centres effectively: complex, fluctuating environments consisting of new and legacy systems, operating in silos that do not naturally communicate or interoperate well.

Driven by economic and competitive factors, IT executives must maximise agility and minimise complexity, which involves harmonising the many disparate elements of their on-premises infrastructure and public and private cloud environments, in order to keep up with fast-changing business strategies – a potentially daunting task.

The app’s the thing

The interconnected ecosystem within the hybrid infrastructure is host to thousands of different applications, all residing within layers of compute, networking, database and storage. For businesses, the focus is on the applications that support critical processes, whilst constantly improving the end-user experience. Each IT silo is typically focused only on the performance of its own area, which creates a misalignment between IT’s objectives and those of the line of business. This makes it exceedingly difficult for organisations to manage applications within the context of the infrastructure supporting them.

It is the critical applications that keep the business running and, because everything is connected, the applications themselves often compete for infrastructure resources such as bandwidth, storage ports, CPU cycles and other networked resources. This can cause a knock-on, or ‘noisy neighbour’ effect that can negatively impact the performance of other applications.

This also affects the IT infrastructure itself, depending upon the business’ workload demands at any given time. The noisy-neighbour effect has been the cause of many latency issues and ‘slow-downs’ affecting day-to-day operations and, much worse, of outages that carry a great financial cost for organisations. The result is a detrimental impact on business continuity, brand reputation and customer loyalty.

Go deep, go wide or go home

In an effort to mitigate risks and keep an eye on their applications, organisations deploy application performance management (APM) tools, entrusted with monitoring and managing application performance and availability. These APM tools keep IT teams informed of various useful metrics, such as uptime, software errors, transaction speeds, traffic statistics, processing, etc.

There’s no denying that the data generated by these powerful tools is useful, but the tools all lack continuous real-time monitoring of application workloads across the data centre infrastructure. This makes it impossible to get a full picture of what is really going on in an environment where every millisecond of latency has a negative impact on the end-user experience, and ultimately on the business.

Worryingly, organisations relying on APM tools may be unaware of just how much vital information is missing from the data provided by these products. APM tools can see application behaviour, report on an issue and sometimes point out its location, but they are only capable of diagnosing performance issues from one perspective: from the end user to the top of the data centre.

When it comes to the other factors that can impact performance, APM tools are blind to anything below the VM (virtual machine) layer. So if there’s an issue within the infrastructure beneath the VM layer that’s impacting application response times, IT teams will be able to see that something is affecting the application’s performance, but they will not be able to identify the cause and resolve it quickly, which leaves the business exposed.

APM tools are also expensive to license, so they tend to be deployed only to monitor critical applications, which may be only 10% of total applications. If, in a shared environment, an unmonitored application causes issues, then the APM tool can see a slowdown but not the cause.

This was precisely the issue we saw at a large UK financial services company: a tier one online banking application was slowing down and almost grinding to a halt for a few minutes at the same time every morning, negatively affecting customers’ online transactions. The IT team was unable to see or locate the issue, as it was outside of the APM tool’s monitoring capability.

An application-centric infrastructure performance management platform was installed and it quickly discovered that the problem was caused by a tier three batch job, running on the production server, that had impacted the entire production environment.

In the banking world, application performance is viewed at microsecond levels, so when key customer applications are experiencing latency of several minutes, it is a substantial problem.
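To make the diagnosis in this example more concrete, the sketch below shows one simple way to line up latency spikes against scheduled job windows. It is an illustration only, not how any particular IPM product works: the `Sample` and `JobRun` types, the job names and the "five times the median" spike threshold are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median


@dataclass
class Sample:
    """One latency measurement for the monitored application (hypothetical type)."""
    timestamp: datetime
    latency_ms: float


@dataclass
class JobRun:
    """One execution window of a scheduled job (hypothetical type)."""
    name: str
    start: datetime
    end: datetime


def find_spikes(samples, factor=5.0):
    """Return samples whose latency exceeds `factor` times the median latency."""
    baseline = median(s.latency_ms for s in samples)
    return [s for s in samples if s.latency_ms > factor * baseline]


def jobs_overlapping_spikes(spikes, job_runs):
    """Count, per job name, how many spike samples fall inside one of its runs."""
    counts = {}
    for spike in spikes:
        for run in job_runs:
            if run.start <= spike.timestamp <= run.end:
                counts[run.name] = counts.get(run.name, 0) + 1
    return counts


if __name__ == "__main__":
    nine_am = datetime(2024, 1, 8, 9, 0)
    # One sample per minute; minutes 5 to 7 are slow (invented data).
    samples = [
        Sample(nine_am + timedelta(minutes=m), 900.0 if 5 <= m < 8 else 20.0)
        for m in range(15)
    ]
    runs = [
        JobRun("tier3-batch", nine_am + timedelta(minutes=5), nine_am + timedelta(minutes=8)),
        JobRun("nightly-backup", nine_am - timedelta(hours=7), nine_am - timedelta(hours=6)),
    ]
    print(jobs_overlapping_spikes(find_spikes(samples), runs))
```

A job that repeatedly coincides with the spikes is a candidate cause rather than a proven one; overlap in time only narrows down where to look in the shared infrastructure.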

Get the big picture

App-centric infrastructure performance management (app-centric IPM) complements APM tools to provide the full picture of the entire infrastructure, end-to-end, in real time. An app-centric IPM solution will non-intrusively optimise the performance and availability of applications by measuring and correlating actual workload behaviour from the VM, physical server and switch fabric right through to the storage layer, providing visibility of the entire stack.

This capability empowers application owners with the insights they need on how key infrastructure metrics are affecting application performance and availability, allowing a shared view and common understanding of how applications use (or cause stress to) the infrastructure.
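As a rough illustration of what correlating workload behaviour across layers can mean in practice, the following sketch compares current latency per layer against a baseline and reports the layer that has degraded the most. The layer names follow the ones mentioned above; the data shapes, function names and the use of a simple mean ratio are assumptions for this example, and a real platform would use far richer metrics.

```python
from statistics import mean

# Layers named in the text; the dictionary keys themselves are hypothetical.
LAYERS = ("vm", "server", "switch_fabric", "storage")


def layer_ratios(baseline, current):
    """Return, per layer, mean current latency divided by mean baseline latency.

    Both arguments map a layer name to a list of latency readings in
    milliseconds. Baseline means are assumed to be non-zero.
    """
    return {
        layer: mean(current[layer]) / mean(baseline[layer])
        for layer in LAYERS
    }


def most_degraded_layer(baseline, current):
    """Return the layer whose latency has grown the most relative to its baseline."""
    ratios = layer_ratios(baseline, current)
    return max(ratios, key=ratios.get)


if __name__ == "__main__":
    # Invented readings: only the storage layer has slowed down.
    baseline = {
        "vm": [1.0, 1.2],
        "server": [0.5, 0.5],
        "switch_fabric": [0.2, 0.2],
        "storage": [2.0, 2.0],
    }
    current = dict(baseline, storage=[8.0, 8.0])
    print(most_degraded_layer(baseline, current))
```

Comparing each layer with its own baseline, rather than comparing raw latencies, avoids always blaming the layer that is slowest in absolute terms.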

Most application performance issues are caused by changes at the application or database layer that the infrastructure is unable to adapt to. The app-centric IPM approach creates a map of the data centre in relation to each application. It provides that vital view of the IT infrastructure in the context of the applications, identifying which are the most important to the business so that service levels can be set against them.

With this missing piece of the puzzle, organisations can better embrace digital transformation and agility, as they will be in prime position to guarantee the harmony and performance of applications and infrastructure as a whole, as well as harness vital intelligence to aid the business with effective decision-making and long-term planning.


Sourced by Sean O’Donnell, managing director, EMEA, Virtual Instruments

Nick Ismail