Addressing the incredible complexity of the modern data centre

The hybrid data centre environment is made up of thousands of applications, supported by many hundreds of infrastructure components and devices all sharing the same environment.

Uncorrelated, silo-specific tools, together with networks, servers, switches and the cloud, all contribute to tremendous blind spots when it comes to assuring the performance of applications.

Equally, senior management in larger businesses requires everything to run as expected and as cost-effectively as possible.

IT underpins the business, and yet, year after year, its teams are expected to do more with fewer resources and a smaller budget.

All of these factors have resulted in enormous pressure on IT teams and the infrastructure to perform. Guaranteeing the performance of the infrastructure that supports business-critical applications amidst all of these varying factors is tricky, at best.

The evolving technology landscape

The data centre has transformed so much in the last twenty years that old perceptions and methodologies are no longer relevant.

If the recent British Airways outage (among others) has shown us anything, it is the vulnerability of the infrastructure, which can be affected by any number of factors that cause latency, performance bottlenecks or, worse, outages.

Applications do not act in isolation: they can be affected by unseen or unknown factors elsewhere in the infrastructure, with negative consequences for the business and the customer.

Clearly, a change in priorities is required: deeper infrastructure insight and planning are urgently needed to ensure that these costly, detrimental incidents don’t keep happening.

Cloudy with a chance of outages

Many organisations have sought relief from the challenges of complexity and exploding data by outsourcing and seeking solace in the cloud. But the cloud is essentially just another data centre, and at its heart are the same applications – subject to the same latency and stresses as an on-site data centre.

According to new research by the Enterprise Strategy Group (ESG)*, nearly two-thirds of enterprises have repatriated some applications from the cloud back to on-premises. People are realising that a more strategic approach to what goes to the cloud is required.

Lack of performance is a key reason for cloud repatriation, and application slowdowns or outages have serious implications for the business.

The fact that cloud providers do not typically offer performance-based SLAs to their customers reveals a glaring shortfall that requires attention.

Slow applications may be removed from the public cloud and placed either in an on-site physical data centre or in a private cloud.

To help offset the risk of performance problems, there are vendors offering proprietary monitoring to alert IT teams to any issues. But the fact is, conventional monitoring doesn’t have the holistic capability to find and eliminate the underlying cause of all problems.

Whether in the cloud or on-premises, a shift in priority is required: monitoring from the point of view of the application is a crucial element in guaranteeing the required level of infrastructure performance.

This is the approach of application-centric infrastructure performance monitoring (app-centric IPM), which highlights the importance of strategically identifying your business-critical applications, knowing where those applications live in the infrastructure, and creating clarity through what is essentially a map of the data centre.

Applications “live” across the layers of compute, network and storage environments, so application-centricity is essential to this approach: an infrastructure performance monitoring (IPM) solution will do a much better job than conventional tools if it can see, correlate and act across all of these environments.

Rather than monitoring a single vendor-specific application, the app-centric IPM approach goes further: it embraces an understanding of business application workloads, in the context of the applications themselves, to ensure that the expected service levels are met.

An app-centric IPM approach means building a picture of how the applications behave, how changes in that behaviour have a knock-on effect on other areas, and how they add stress to various components in the infrastructure within particular timescales.
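The idea of a “map of the data centre” can be illustrated with a minimal sketch. The application names, component names and the `APP_MAP` structure below are hypothetical and are not taken from any IPM product; the point is only that, once each application is tied to the components it depends on, the places where a knock-on effect could travel become visible.

```python
from collections import defaultdict

# Hypothetical inventory: application -> infrastructure components it depends on.
APP_MAP = {
    "order-processing": {"host-01", "switch-a", "array-1"},
    "billing": {"host-02", "switch-a", "array-2"},
    "marketing-campaign": {"host-03", "switch-b", "array-1"},
}


def shared_components(app_map):
    """Return {component: sorted apps} for components used by more than one app."""
    users = defaultdict(set)
    for app, components in app_map.items():
        for component in components:
            users[component].add(app)
    return {c: sorted(apps) for c, apps in users.items() if len(apps) > 1}


def neighbours(app_map, app):
    """Return the other applications sharing at least one component with `app`."""
    return sorted(
        other
        for other, components in app_map.items()
        if other != app and components & app_map[app]
    )


shared = shared_components(APP_MAP)
order_neighbours = neighbours(APP_MAP, "order-processing")
```

In this made-up inventory, `order-processing` shares a switch with `billing` and a storage array with `marketing-campaign`, so a change in either neighbour’s behaviour is a candidate explanation if order processing slows down.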

Shiny new things and blind faith in performance

Typical examples of this knock-on effect are the many challenges arising from the integration of new technology.

Can you comprehensively answer the question: “how will this new storage deployment impact application performance?” How do you truly know that your new purchase will work as it should?

According to the ESG survey, only about “40% of enterprise storage managers profile their workloads before buying storage”, which means that about 60% of storage managers put their businesses at risk and blindly trust their vendors or VARs to deliver on the performance promise.

In addition, only 40% of enterprise storage managers load test storage systems before purchase and deployment, again putting their businesses at risk.

It is human nature to rush out and buy the next shiny new thing, but that is not necessarily the best policy – you need to understand how the new deployment will fit within the infrastructure and perform with your actual business application workloads, rather than in a generic test lab.
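As a rough illustration of what profiling a workload before buying storage might involve, the sketch below summarises a captured I/O trace into a few headline figures. The record layout `(timestamp_s, op, size_bytes)`, the function name and the sample values are all assumptions made for this example; real profiling tools capture far more detail.

```python
from collections import Counter


def profile_workload(io_records):
    """Summarise a list of (timestamp_s, op, size_bytes) I/O records."""
    if not io_records:
        raise ValueError("no I/O records to profile")
    reads = sum(1 for _, op, _ in io_records if op == "read")
    total_bytes = sum(size for _, _, size in io_records)
    # Bucket operations by whole second to estimate the busiest second.
    per_second = Counter(int(ts) for ts, _, _ in io_records)
    return {
        "read_ratio": reads / len(io_records),
        "mean_io_size_bytes": total_bytes / len(io_records),
        "peak_iops": max(per_second.values()),
    }


# Made-up sample trace, for illustration only.
sample = [
    (0.1, "read", 4096),
    (0.4, "read", 4096),
    (0.9, "write", 65536),
    (1.2, "read", 8192),
]
profile = profile_workload(sample)
```

A profile like this, taken from the actual business applications, gives a load test something realistic to replay against a candidate system instead of a generic lab workload.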

Not all organisations are the same and not all workloads are the same. Vendors’ promises are powerless if there’s a hidden issue.

New procurements don’t work in isolation; they must be properly integrated into an ecosystem that is susceptible to changes and is ultimately affected by everything else. This is precisely why visibility of the entire infrastructure is so important.

Such visibility brings the benefits of correlation and contextual insight into application behaviour within the bespoke environment.

This highlights the importance of taking precautions and testing for assurance, utilising the intelligence of correlating analytics to help make accurate purchasing decisions and avoid costly over-provisioning.

Noisy neighbours

Another issue prevalent in today’s hybrid data centre is the noisy neighbour, an all-too-common example of the knock-on effects of a multi-tenant, shared environment.

A noisy neighbour is an application that shares bandwidth or other networked resources within the infrastructure with other applications.

When that application is in high demand it can negatively affect the performance of other applications.

For example, what would be the outcome and impact on the customers and just-in-time supply chain of a manufacturer if someone in marketing were to launch a new application to 100,000 potential customers using the same IT infrastructure?

It would cause a massive slow-down and the IT teams would have no idea why or where the issue came from.

Application performance monitoring (APM) tools would not be able to see this problem as, due to their cost, they only monitor critical applications – and the marketing app may not be regarded as critical.

The storage administrators may have put monitoring tools in place, but these would not be able to diagnose where the slowdown is coming from in this instance. With the limited visibility of traditional tools, only the critical applications are monitored, so the tools could report only that the manufacturing systems are slowing down.

However, if that same IT team had adopted an app-centric IPM approach covering the entire infrastructure, it would not only have been able to rapidly locate the cause of the noisy neighbour issue, but could also have reallocated resources and avoided it altogether.
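One simple way to picture how infrastructure-wide visibility helps locate a noisy neighbour is to compare the affected application’s latency with the load generated by every application on the shared component. The sketch below uses a plain Pearson correlation for this; the application names and sample figures are invented, and this is an illustration of the idea, not a description of how any particular IPM product works. Correlation only points to a suspect and does not prove cause.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def rank_noisy_neighbours(victim_latency_ms, neighbour_load):
    """Rank neighbours by how closely their load tracks the victim's latency."""
    scores = {
        name: pearson(load, victim_latency_ms)
        for name, load in neighbour_load.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Made-up samples, one per minute, for applications sharing one storage array.
manufacturing_latency_ms = [5, 5, 6, 20, 35, 40, 8, 5]
load_iops = {
    "marketing-campaign": [100, 120, 150, 4000, 9000, 9500, 300, 110],
    "billing": [800, 790, 810, 805, 795, 800, 810, 790],
}
ranking = rank_noisy_neighbours(manufacturing_latency_ms, load_iops)
```

With these figures the marketing application ranks first by a wide margin, which is the kind of lead that tools watching only the critical applications would never surface, because the marketing load is not among the things they measure.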

The proof is in the performance

These all-too-common issues affecting data centres urgently require a shift in priorities. The truth about latency is that response times that degrade by 70-95% are often just as bad as real outages.

As times have changed, so have our perceptions of service continuity. Even a minor slowdown caused by a bottleneck in system performance will be perceived by customers as an outage, resulting in lost business.

Nor should we forget the impact on brand reputation: today, everyone engages with businesses online or via mobile apps.

Whether end users are booking a flight, placing an order for delivery, or transferring funds through a bank transaction, they expect that interaction to work seamlessly, and if it doesn’t, or worse, it goes down completely, confidence in that organisation is quickly eroded.

Here, a key customer-facing capability has failed and people will think twice before using your product or service. The perception is: if you aren’t servicing your IT systems properly, how can you service your customers?

Forewarned is forearmed

An app-centric IPM approach substantially improves the relationship between applications and the infrastructure, thereby helping to guarantee application performance to the business and end users. It is the priority shift required to make organisations future-ready.

The data centre is a highly complex environment. Some applications can withstand performance problems and some cannot, and this kind of thorough approach is the only accurate way of telling the difference.

In addition, the ability to deliver analytics on precisely what is going on within the data centre will enable IT to prove its value up the chain to the rest of the business.

The detailed metrics provided by an app-centric IPM approach can provide key intelligence to all levels of the organisation including IT teams, CIOs, CFOs, application managers and storage administrators.

It removes the guessing and the finger pointing and gives all departments a common view, no quibbling.

Confucius said, “The expectations of life depend upon diligence; the mechanic that would perfect his work must first sharpen his tools.” And he was right: we must sharpen our tools, shift our priorities and hone our approach to suit the issues at hand to truly achieve a future-ready hybrid data centre.

Armed with insights and real-time metrics on the entire infrastructure, executives and IT teams alike can have far greater visibility of any infrastructure vulnerabilities.

With organisations being more informed, they are better able to guarantee performance and make accurate IT decisions aligned to the specific application workloads and needs of the business.

*The ESG research is based on 412 completed online surveys with storage professionals in multiple industry verticals across midmarket and enterprise organisations throughout Europe (UK, France, Germany, Italy, Russia, Belgium and the Netherlands).

Sourced from Sean O’Donnell, managing director EMEA, Virtual Instruments

Nick Ismail

Nick Ismail is a former editor for Information Age (from 2018 to 2022) before moving on to become Global Head of Brand Journalism at HCLTech. He has a particular interest in smart technologies, AI and...
