What’s really going on in your IT estate? What’s running in there?

Knowing for sure can be hard. A small hybrid data centre might support hundreds of complex, fast-changing, business-critical applications, each with numerous interdependent, distributed software components; connections to shared resources (e.g., corporate databases); configuration facts; deployment and lifecycle management procedures; stakeholder information; and other details. Many staffers may have access to cloud facilities that let them self-serve platforms and resources at will. No one person or team owns everything day to day.

That’s bad news for IT leaders who want to move faster while controlling costs, say by moving key workloads to the (private or public) cloud, consolidating to improve utilisation, and shifting expenditures from CapEx to OpEx. Many organisations begin their cloud journey explicitly in search of cost savings, usually triggered by the realisation that most physical servers are painfully under-optimised (in “one app per box” regimes, working at only 6% to 12% of actual capacity).

As most quickly learn, however, utilisation of cloud-based VMs is only slightly better. The world’s most advanced cloud users (e.g., companies like Google and Twitter) estimate internal VM utilisation at around 50% of capacity. It’s a safe bet that most organisations are less efficient and are paying more for resources. Yet despite this inefficiency, VM sprawl keeps increasing. Gartner estimates that the installed base of Windows licenses has grown 77% over the past four years, almost all of these on VMs hosted in private and public clouds.

It’s unsurprising, therefore, that public cloud costs are still often much higher than projected. What’s shocking, however, is that 30% to 50% of that cost is wasted — on overprovisioned, abandoned, and “zombie” VMs (VMs that failed during boot, or were powered on but are not providing services), unused server reservations (a money-saving strategy gone bad through inattention), and other cruft.

With the fully burdened (i.e., labour and other overheads included) cost of operating a single VM estimated by Gartner at around US$6,000 per year, consolidation and waste elimination should clearly be an ongoing process. Your goal should be to always run the minimum set of resources required for good application performance and robust availability. Need additional motivation? Orphan VMs that aren’t updated or hardened regularly are prime targets for hacking: getting rid of them enormously reduces your attack surface and your potential liability.
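To make the stakes concrete, here is a back-of-the-envelope sketch in Python. The US$6,000 per-VM figure is the Gartner estimate quoted above; the estate size and idle fraction are purely hypothetical inputs chosen for illustration, not measurements.

```python
# Gartner's fully burdened estimate quoted in the article (USD per VM per year).
COST_PER_VM_PER_YEAR = 6_000


def annual_waste(total_vms: int, idle_fraction: float) -> float:
    """Estimate yearly spend on VMs that deliver no useful work.

    Both arguments are hypothetical inputs; substitute figures
    measured in your own estate.
    """
    if not 0.0 <= idle_fraction <= 1.0:
        raise ValueError("idle_fraction must be between 0 and 1")
    return total_vms * idle_fraction * COST_PER_VM_PER_YEAR


if __name__ == "__main__":
    # Hypothetical estate: 500 VMs, of which 10% are zombies or abandoned.
    print(f"Estimated annual waste: ${annual_waste(500, 0.10):,.0f}")
```

Even a modest idle fraction adds up quickly at this unit cost, which is why the exercise is worth repeating rather than treating as a one-off audit.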

But here’s the catch: how do you know which hosts are zombied, abandoned, or underutilised? Many organisations have no more efficient solution than to perform periodic, time-consuming, manual audits in an attempt to stay ahead of the problem.

Enterprise IT monitoring as a source of real-time truth

Flagging comatose and underutilised systems and efficiently negotiating (or compelling) their consolidation or termination requires a trustworthy source of detailed information about asset roles, interdependencies relating to delivery of business services, asset and business service ownership, and utilisation metrics. Enterprise IT monitoring can provide such a single source of truth — probably more efficiently than any other tool.

This doesn’t, of course, mean that Enterprise IT monitoring is the only way to skin this cat. IT Asset Management (ITAM) and Configuration Management Databases (CMDBs), for example, are designed to enable rigorous modelling of your IT environment, capture asset interdependencies and map business services in detail, and maintain rich information about requirements and ownership. Some CMDBs provide (or in some cases, integrate with) IT resource discovery frameworks to populate asset configurations partially, or integrate with Enterprise IT monitoring systems to derive or validate a picture of the current state of infrastructure.

CMDBs are designed to assist with planning, large-scale change management, governance documentation, and similar tasks, so in theory it should also be possible to use one to identify at least some consolidation candidates. But CMDBs are designed to represent intention and encode standards for “the authorised configuration of the significant components of the IT environment” (see Wikipedia, Configuration Management Database), rather than provide live insights about reality in all its messy detail. CMDB users usually omit the state of non-strategic and non-top-down-managed sectors of the IT estate, such as transitory dev/test or long-running legacy systems, both of which are prime candidates for aggressive pruning. And while CMDBs can provide rich information about configuration, dependencies, and ownership of strategically important systems, they provide no utilisation metrics, so you still need monitoring for that.

Why enterprise IT monitoring works better

That’s one obvious reason why Enterprise IT monitoring works better than a CMDB as a stand-alone solution to this problem. But there are many other reasons, too:

For good and obvious reasons (quite apart from cost savings), monitoring should be applied everywhere. You need Enterprise IT monitoring to keep critical business services up and performing well, find and fix issues fast, proactively manage underlying infrastructure, and meet SLOs and SLAs. That job is absolutely critical and, moreover, enables many kinds of cost saving and risk mitigation that are arguably as valuable as continuous consolidation. A plus: the best IT monitoring solutions scale gracefully to monitor even the largest IT estates.

It’s easy, therefore, to justify monitoring everything, and to insist on automatically implementing monitoring with every host, VM, or application you deploy. Doing this, moreover, is technically pretty easy, especially if your data centre operates on an “infrastructure as code” regime. Just add monitoring agent deployment to host deployment scripts/recipes/playbooks, and monitoring configuration (via REST API) to host, application, business service, and cloud service tooling. Enterprise monitoring solutions provide hundreds of pre-packaged monitoring templates to make configuration easier. Base templates for Linux and Windows hosts (bare metal, VM, or public-cloud hosted), storage systems and other low-level infrastructure come with preconfigured service checks measuring utilisation from CPU, RAM, storage, network, and other relevant perspectives.

Result: everything you deploy comes up monitored. With 100% visibility into your infrastructure, it becomes much easier to identify consolidation and pruning targets.
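As a rough illustration of the "register every host via REST API" step, a deployment script might build a request like the one below. This is only a sketch under stated assumptions: the endpoint path, the payload field names, the template names, and the bearer-token scheme are all hypothetical, so check your monitoring product's API documentation for the real ones.

```python
import json
import urllib.request

# Hypothetical endpoint path; substitute whatever your monitoring product documents.
HOSTS_ENDPOINT = "/api/hosts"


def build_host_payload(hostname, ip_address, os_family, business_service):
    """Describe a new host for the monitoring system (field names are assumed)."""
    # Pick a base template by OS so that utilisation checks (CPU, RAM,
    # storage, network) are attached as soon as the host is registered.
    templates = {"linux": "base-linux", "windows": "base-windows"}
    if os_family not in templates:
        raise ValueError(f"no base template for OS family: {os_family}")
    return {
        "name": hostname,
        "ip": ip_address,
        "templates": [templates[os_family]],
        "business_service": business_service,
    }


def build_registration_request(base_url, token, payload):
    """Prepare (but do not send) the POST that registers the host."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + HOSTS_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # assumed auth scheme
        },
        method="POST",
    )


if __name__ == "__main__":
    payload = build_host_payload("web-01", "10.0.0.21", "linux", "online-store")
    request = build_registration_request(
        "https://monitoring.example.com", "TOKEN", payload
    )
    # A deployment playbook would now call urllib.request.urlopen(request).
    print(request.get_method(), request.full_url)
```

Keeping payload construction separate from sending makes the step easy to test and to call from whichever provisioning tool you already use.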

Enterprise IT monitoring solutions enable automated or manual definition of host groups, clusters, and large-scale business services. Once assets are configured as members of a business service, it’s much easier to understand interdependencies, project the impact of changes, interpret utilisation data, right-size VMs, and plan consolidations and related workload migrations. A big plus: being able to visualise resilient clusters and distributed service tiers helps you recognise situations where maintaining necessary redundancy takes priority over reducing server headcounts and upping utilisation.

Enterprise IT monitoring connects you naturally to appropriate technical and business stakeholders for each service and platform in your hybrid estate. To configure alerting and escalation properly, you need to understand ownership and accountability for systems in your charge. These are usually the people and teams you’ll need to consult and negotiate with to optimise utilisation or discuss taking unused assets offline.

Enterprise IT monitoring supports sophisticated reporting. Scheduling or pulling an ad-hoc utilisation or network traffic report is a great way of flagging obvious zombies and abandoned hosts.
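The kind of triage such a report enables can be sketched in a few lines of Python. The thresholds, field names, and host names below are hypothetical, picked only to show the shape of the logic; real cut-offs should come from your own baseline data, and every flagged host still needs confirmation from its owner.

```python
from dataclasses import dataclass

# Hypothetical thresholds; tune them against your own baseline data.
IDLE_CPU_PCT = 2.0
IDLE_NET_BYTES_PER_S = 1_000
LOW_CPU_PCT = 15.0


@dataclass
class HostReport:
    name: str
    avg_cpu_pct: float          # mean CPU utilisation over the report window
    avg_net_bytes_per_s: float  # mean network throughput over the window
    in_resilient_cluster: bool = False  # member of a deliberately redundant group


def classify(host: HostReport) -> str:
    """Return 'zombie', 'underutilised', or 'ok' for one report row."""
    # Deliberately redundant nodes are expected to sit idle, so they are
    # never flagged automatically; leave them to human review.
    if host.in_resilient_cluster:
        return "ok"
    if (host.avg_cpu_pct < IDLE_CPU_PCT
            and host.avg_net_bytes_per_s < IDLE_NET_BYTES_PER_S):
        return "zombie"
    if host.avg_cpu_pct < LOW_CPU_PCT:
        return "underutilised"
    return "ok"


def consolidation_candidates(report):
    """Map each flagged host name to its classification."""
    return {h.name: classify(h) for h in report if classify(h) != "ok"}


if __name__ == "__main__":
    report = [
        HostReport("build-07", 0.4, 120.0),
        HostReport("crm-db-02", 9.0, 50_000.0),
        HostReport("lb-standby", 1.0, 200.0, in_resilient_cluster=True),
        HostReport("web-01", 48.0, 2_000_000.0),
    ]
    print(consolidation_candidates(report))
```

Exempting cluster members reflects the earlier point that necessary redundancy can take priority over raising utilisation.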

Most importantly, Enterprise IT monitoring makes all this information readily discoverable under a single pane of glass, where it is consumable and actionable by IT generalists. That means your cost-saving programme can itself be conducted cost- and time-efficiently, for maximum payback.

Written by John Jainschigg, Content Strategy Lead at Opsview

John Jainschigg, Content Strategy Lead at Opsview, explains to Information Age why Enterprise IT monitoring works better than CMDB.
