Knowing for sure can be hard. A small hybrid data centre might support hundreds of complex, fast-changing, business-critical applications, each with numerous interdependent, distributed software components; connections to shared resources (e.g., corporate databases); configuration facts; deployment and lifecycle management procedures; stakeholder information; and other details. Many staffers may have access to cloud facilities that let them self-serve platforms and resources at will. No one person or team owns everything day to day.
That’s bad news for IT leaders who want to move faster while controlling costs, say by moving key workloads to the (private or public) cloud, consolidating to improve utilisation, and shifting expenditures from CapEx to OpEx. Many organisations begin their cloud journey explicitly in search of cost savings, usually triggered by the realisation that most physical servers are painfully underutilised (in “one app per box” regimes, working at only 6% to 12% of actual capacity).
As most quickly learn, however, utilisation of cloud-based VMs is only slightly better. The world’s most advanced cloud users (e.g., companies like Google and Twitter) estimate internal VM utilisation at around 50% of capacity. It’s a safe bet that most organisations are less efficient and are paying more for resources. Yet despite this inefficiency, VM sprawl keeps increasing. Gartner estimates that the installed base of Windows licenses has grown 77% over the past four years, almost all of these on VMs hosted in private and public clouds.
It’s unsurprising, therefore, that public cloud costs are still often much higher than projected. What’s shocking, however, is that 30% to 50% of that cost is wasted — on overprovisioned, abandoned, and “zombie” VMs (VMs that failed during boot, or were powered on but are not providing services), unused server reservations (a money-saving strategy gone bad through inattention), and other cruft.
The value of visibility in your data centre
With the fully-burdened (i.e., labour and other overheads included) cost of operating a single VM estimated by Gartner as around $6,000 USD/year, consolidation and waste elimination should clearly be an ongoing process. Your goal should be to always run the minimum set of resources required for good application performance and robust availability. Need additional motivation? Orphan VMs that aren’t updated or hardened regularly are prime targets for hacking: getting rid of these honeypots enormously reduces your attack surface and your potential liability.
But here’s the catch: how do you know which hosts are zombied, abandoned, or underutilised? Many organisations have no more efficient solution than to perform periodic, time-consuming, manual audits in an attempt to stay ahead of the problem.
Enterprise IT monitoring as a source of real-time truth
Flagging comatose and underutilised systems and efficiently negotiating (or compelling) their consolidation or termination requires a trustworthy source of detailed information about asset roles, interdependencies relating to delivery of business services, asset and business service ownership, and utilisation metrics. Enterprise IT monitoring can provide such a single source of truth — probably more efficiently than any other tool.
This doesn’t, of course, mean that Enterprise IT monitoring is the only way to skin this cat. IT Asset Management (ITAM) and Configuration Management Databases (CMDBs), for example, are designed to enable rigorous modelling of your IT environment, capture asset interdependencies and map business services in detail, and maintain rich information about requirements and ownership. Some CMDBs provide (or in some cases, integrate with) IT resource discovery frameworks to populate asset configurations partially, or integrate with Enterprise IT monitoring systems to derive or validate a picture of the current state of infrastructure.
Because CMDBs are designed to assist with planning, large-scale change management, governance documentation, and similar tasks, it should in theory also be possible to use one to identify at least some consolidation candidates. But CMDBs are designed to represent intention and encode standards for “the authorised configuration of the significant components of the IT environment” (see Wikipedia, Configuration Management Database), rather than provide live insights about reality in all its messy detail. CMDB users usually omit capturing the state of non-strategic and non-top-down-managed sectors of the IT estate, such as transitory dev/test or long-running legacy systems, both prime candidates for aggressive pruning. And while CMDBs can provide rich information about configuration, dependencies, and ownership of strategically important systems, they provide no utilisation metrics, so you still need monitoring for that.
Why enterprise IT monitoring works better
The lack of utilisation metrics is one obvious reason why Enterprise IT monitoring works better than a CMDB as a stand-alone solution to this problem. But there are many other reasons, too:
For good and obvious reasons (quite apart from cost savings), monitoring should be applied everywhere. You need Enterprise IT monitoring to keep critical business services up and performing well, find and fix issues fast, proactively manage underlying infrastructure, and meet SLOs and SLAs. That job is absolutely critical and, moreover, enables many kinds of cost saving and risk mitigation arguably as valuable as continuous consolidation. A plus: the best IT monitoring solutions scale gracefully to monitor even the largest IT estates.
It’s easy, therefore, to justify monitoring everything, and to insist on automatically implementing monitoring with every host, VM, or application you deploy. Doing this, moreover, is technically pretty easy, especially if your data centre operates on an “infrastructure as code” regime. Just add monitoring agent deployment to host deployment scripts/recipes/playbooks, and monitoring configuration (via REST API) to host, application, business service, and cloud service tooling. Enterprise monitoring solutions provide hundreds of pre-packaged monitoring templates to make configuration easier. Base templates for Linux and Windows hosts (bare metal, VM, or public-cloud hosted), storage systems and other low-level infrastructure come with preconfigured service checks measuring utilisation from CPU, RAM, storage, network, and other relevant perspectives.
Result: everything you deploy comes up monitored. With 100% visibility into your infrastructure, it becomes much easier to identify consolidation and pruning targets.
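As a rough illustration of wiring monitoring into a deployment script, the Python sketch below builds a registration payload for a newly deployed host and posts it to a monitoring REST API. The endpoint URL, the payload field names, and the template names (`base-linux`, `base-windows`) are all assumptions made up for this example; every monitoring product defines its own API, so treat this as a shape to adapt rather than a working integration.

```python
import json
import urllib.request

# Hypothetical endpoint: replace with your monitoring product's real API URL.
MONITORING_API = "https://monitoring.example.com/api/hosts"

# Hypothetical template names standing in for a product's pre-packaged base templates.
BASE_TEMPLATES = {"linux": "base-linux", "windows": "base-windows"}


def build_host_registration(hostname, ip_address, os_family, business_service):
    """Build the (assumed) JSON payload that registers a new host for monitoring."""
    if os_family not in BASE_TEMPLATES:
        raise ValueError(f"no base template for OS family: {os_family}")
    return {
        "name": hostname,
        "address": ip_address,
        "templates": [BASE_TEMPLATES[os_family]],
        "business_service": business_service,
    }


def register_host(payload, token):
    """POST the payload to the monitoring API and return the HTTP status code."""
    request = urllib.request.Request(
        MONITORING_API,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status


if __name__ == "__main__":
    # Called as the last step of a host deployment script or playbook.
    payload = build_host_registration("web-01", "10.0.0.5", "linux", "storefront")
    print(json.dumps(payload, indent=2))
```

Keeping payload construction separate from the network call means the same function can be reused from whichever scripts, recipes, or playbooks already deploy your hosts.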
Enterprise IT monitoring solutions enable automated or manual definition of host groups, clusters, and large-scale business services. Once assets are configured as members of a business service, it’s much easier to understand interdependencies, project the impact of changes, interpret utilisation data, right-size VMs, and plan consolidations and related workload migrations. A big plus: being able to visualise resilient clusters and distributed service tiers helps you recognise situations where maintaining necessary redundancy takes priority over reducing server headcounts and upping utilisation.
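To make the redundancy point concrete, here is a minimal Python sketch that checks consolidation candidates against business service membership before anything is retired. The data layout (a dictionary of services, each with `members` and a `min_members` floor) and the function name are hypothetical, chosen for this example; a real monitoring system would expose service membership through its own model.

```python
def split_candidates(services, candidates):
    """Separate retirement candidates into approved hosts and hosts held for review.

    A candidate is held if retiring every candidate in one of its business
    services would leave that service below its minimum member count.
    """
    candidate_set = set(candidates)
    held = set()
    for service in services.values():
        members = set(service["members"])
        remaining = members - candidate_set
        if len(remaining) < service["min_members"]:
            # Retiring these hosts would break the service's redundancy floor.
            held |= members & candidate_set
    approved = [host for host in candidates if host not in held]
    return approved, sorted(held)


if __name__ == "__main__":
    services = {
        "storefront": {"members": ["web-01", "web-02", "web-03"], "min_members": 2},
        "reporting": {"members": ["rep-01", "rep-02"], "min_members": 1},
    }
    approved, held = split_candidates(services, ["web-02", "web-03", "rep-02"])
    print("approved:", approved)  # approved: ['rep-02']
    print("held:", held)          # held: ['web-02', 'web-03']
```

Hosts that belong to no business service pass this check by default, so they would still need confirmation from an owner before being taken offline.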
Enterprise IT monitoring connects you naturally to appropriate technical and business stakeholders for each service and platform in your hybrid estate. To configure alerting and escalation properly, you need to understand ownership and accountability for systems in your charge. These are usually the people and teams you’ll need to consult and negotiate with to optimise utilisation or discuss taking unused assets offline.
Enterprise IT monitoring supports sophisticated reporting. Scheduling or pulling an ad-hoc utilisation or network traffic report is a great way of flagging obvious zombies and abandoned hosts.
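A report-driven check of this kind might look like the Python sketch below, which flags hosts whose average CPU and network activity have both stayed very low over a long enough observation window. The record fields and the thresholds (5% CPU, 10 kbps, 30 days) are illustrative assumptions, not recommended values; tune them to your own estate and treat the output as a list of candidates to investigate, not a kill list.

```python
from dataclasses import dataclass


@dataclass
class HostUtilisation:
    """One row of a (hypothetical) utilisation report exported from monitoring."""
    name: str
    avg_cpu_percent: float
    avg_network_kbps: float
    days_observed: int


def flag_zombie_candidates(report, cpu_threshold=5.0, network_threshold=10.0, min_days=30):
    """Return names of hosts that look idle on both CPU and network.

    Hosts observed for fewer than min_days are skipped, so newly deployed
    machines that have not yet taken traffic are not flagged.
    """
    return [
        host.name
        for host in report
        if host.days_observed >= min_days
        and host.avg_cpu_percent < cpu_threshold
        and host.avg_network_kbps < network_threshold
    ]


if __name__ == "__main__":
    report = [
        HostUtilisation("web-01", 42.0, 900.0, 60),
        HostUtilisation("old-test-07", 1.2, 0.4, 90),
        HostUtilisation("new-batch-02", 0.5, 0.1, 3),
    ]
    print(flag_zombie_candidates(report))  # ['old-test-07']
```

Requiring both signals to be low avoids flagging quiet but legitimate systems, such as a standby node that idles on CPU yet still exchanges heartbeat traffic above the network threshold.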
Most important, Enterprise IT monitoring makes all this information readily discoverable under a single pane of glass, where it is consumable and actionable by IT generalists. That means your cost-saving program can itself be conducted cost- and time-efficiently, for maximum payback.
Written by John Jainschigg, Content Strategy Lead at Opsview