The iceberg organisation

Jonah Kowall, AppDynamics, discusses how enterprises can uncover their iceberg infrastructure to drive innovation and protect their business.

A post-it note with ‘DO NOT TURN OFF!’ scribbled across its face is silhouetted on a computer in a dark corner of an office. Employees, IT teams, apps and systems have come and gone since the note was put up. Today, no one quite knows what the machine is running, but no one wants to be the one who pulls the plug.

Almost every enterprise has a piece of infrastructure or an application component like that. In any given enterprise, IT infrastructure is an iceberg. A known portion sits above the surface, visible and understood, yet the full mass lies below the waterline: unknown and unseen, with passageways and fault lines that are impossible to find and fix when errors occur. Today, very few organisations are able to map their IT infrastructure with any accuracy. Perspectives vary widely across the silos and teams within an organisation. Each sees a different portion of the infrastructure glittering in the sun, whilst the majority remains submerged and unknown. The layers of these icebergs form over years, through legacy infrastructure and applications, partial migrations and projects, obscure and undocumented updates or builds, acquisitions, divestitures, abandoned projects and the departure of internal experts. The cost of these icebergs can be catastrophic, but it is often hard to measure or understand.

The ever-present mainframe

The mainframe is one of the largest underwater masses in the iceberg organisation. It runs the most critical systems and is tended by a specialist team, so very few people see into it. The age of big iron is far from over, but as many specialists retire, the skills gap is becoming a significant challenge for maintenance, development and visibility. These experts have become a precious resource that needs to be used intelligently.

Some people may be surprised to learn that they interact with a mainframe every day, but these critical stalwarts are still powering the planet, and for good reason. According to IBM, mainframes run 68% of the world’s production workloads at 6% of total IT costs. Mainframes are a critical piece of IT infrastructure for the most powerful industries around the globe, with 44 of the top 50 banks, ten of the top insurance companies, and 18 of the top 25 retailers relying on IBM’s mainframe systems. Mainframe computing is built to run the kinds of secure, reliable and scalable experiences demanded by modern consumers. But with critical use cases comes a need for rapid detection and resolution of issues. 24 hours is an awfully long time for a banking system to be down. Every second between problem and resolution counts.

The tip of the iceberg

As enterprises build out new digital offerings and transform their infrastructure, the mainframe will often have several new technologies and applications layered upon it. This layering creates even more complexity, which makes troubleshooting more difficult and more time-consuming. Today’s applications are increasingly distributed and, as a result, highly interdependent. A problem that takes two minutes to fix might take two hours to track down. Teams organised in clumps around their technology specialism find themselves ill-prepared to tackle enterprise-wide problems. Whilst an issue might manifest itself in the telemetry of one team, the root cause could sit with one or several others. Attempts to solve problems rely on collaboration between siloed teams, none of which has visibility of the overall IT infrastructure. Mean Time to Resolution rapidly becomes Mean Time to Innocence in the war room setting, as each team brandishes different reports from different tools in a bid to prove that its people or technology are not to blame. Each team focuses on its own area of responsibility and remains blind to the broader context. The mainframe is one of the greatest silos of all, regularly overlooked by cross-functional DevOps teams and forgotten despite its critical role.

Mainframes may be cost-efficient to run, but what does a lack of visibility cost the enterprise? Mean Time to Resolution is extended and innovation slows to a halt. Critical experts are tied up in war room meetings about issues that might not even involve their tech stack. Leaders of each IT team may be on call throughout the night to resolve critical issues.

On average over the past year, organisations experienced four business-impacting application disruptions per month. It took, on average, 10 hours to discover the root cause of each disruption and around 20 hours to resolve it fully. The stakes are too high to continue along these tracks, ignoring the icebergs.

What lies beneath?

Today’s teams are isolated by silos and disparate tools. Worse still, most monitoring systems offer little visibility of the mainframe. Exposing the iceberg and eliminating the incident war room is the fastest path to efficiency, and that starts with shared insight.

Clear, real-time mapping of the business transactions used to measure business outcomes, all the way to the mainframe, is critical. This kind of transparency is foundational to the cultural and operational shifts of a modern enterprise. IT has always fixed problems, but with better insight it can prevent them too. A single pane of glass that can dive all the way down to individual assets and lines of activity can become the central nervous system of the enterprise. Applying machine learning within these telemetry tools allows problems to be identified as service degrades, rather than waiting until users complain of business-affecting issues. Clarity and transparency around the user’s journey, for all teams, pave the way not only for problem-solving but also for innovation and problem prevention.

By correlating telemetry collected from applications and infrastructure with the transactional events that make up a customer journey (such as add to basket, searches or other customer interactions), you can build an understanding of how users are interacting with your business systems. Identifying performance issues or slow response times is easier when you can measure business impact. Baselining each collected metric along with each important business transaction can help proactively identify problems, and what they affect, without bundling everyone into a room. This way you can engage only the experts you need to address and resolve the problem quickly. This frees up already overburdened experts and critical resources for more valuable pursuits that drive innovation or better protect the business.
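To make the idea of baselining concrete, here is a minimal sketch in Python. It is an illustration only, not a description of how any particular monitoring product works. The transaction names, response times, team names and the three-standard-deviation threshold are all assumptions made for the example.

```python
from statistics import mean, stdev

def build_baseline(samples):
    """Summarise historical response times (ms) as (mean, standard deviation)."""
    return mean(samples), stdev(samples)

def is_degraded(value, baseline, threshold=3.0):
    """Flag a reading that is slower than the baseline by more than
    `threshold` standard deviations (an assumed, simplistic rule)."""
    average, spread = baseline
    if spread == 0:
        return value > average
    return (value - average) / spread > threshold

def teams_to_engage(history, latest, owners, threshold=3.0):
    """Return only the teams whose business transactions have degraded."""
    engaged = []
    for transaction, samples in history.items():
        baseline = build_baseline(samples)
        if is_degraded(latest[transaction], baseline, threshold):
            engaged.append(owners[transaction])
    return engaged

# Hypothetical data: past response times per business transaction.
history = {
    "add_to_basket": [120, 130, 125, 118, 122, 127],
    "search": [80, 85, 78, 82, 81, 79],
}
# Hypothetical mapping from each transaction to the team that owns it.
owners = {"add_to_basket": "basket team", "search": "search team"}
# Hypothetical latest readings: search has slowed sharply.
latest = {"add_to_basket": 126, "search": 240}

print(teams_to_engage(history, latest, owners))
```

In this made-up scenario only the search transaction has drifted far from its baseline, so only the search team is called in. A real deployment would need richer baselines, for example ones that account for time of day or seasonal traffic.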

Iceberg organisations are commonplace but not irreparable. As application complexity increases, the first step for many will be to know their business better. If the phrase ‘Mean Time to Innocence’ raised a wry smile, you cannot afford to delay. With the right tools and processes, businesses can uncover the iceberg and repurpose assets, skills and information to better serve the business.

Sourced by Jonah Kowall, Vice President of Market Development and Insights at AppDynamics

Andrew Ross

As a reporter with Information Age, Andrew Ross writes articles for technology leaders, helping them manage business-critical issues both today and in the future.
