Autonomic for the people

The data centre is fast becoming a black hole for IT resources. The sheer number of applications, their supporting infrastructure and the tools to manage it all is manpower intensive. The well-paid staff that oversee the running of the data centre find their time taken up on routine tasks. It is a situation that wastes talents, threatens to strangle innovation and is inordinately inefficient.

Much of the current problem stems from the complexity of operations within the data centre; complexity that stems from the lack of interoperability between applications and the inherent difficulty in managing these incongruous systems.

IBM: technology and open standards

IBM made its initial forays into self-managing systems in 1999 with the eLiza initiative, which has subsequently evolved into its autonomic computing initiative. It positions itself as capable of introducing self-healing and self-managing technology into every level of the infrastructure, from processors to applications.

Essentially its autonomic initiative is an element of its overarching "On demand" strategy, but IBM has succeeded in aligning its many different units to work together more transparently. It now wants other vendors to do the same.

David Bartlett, IBM's director of autonomic computing, describes four key initiatives: creating an "overarching architecture" for autonomic systems, based on ITIL best-practice; driving interoperability standards; developing products so that they can be automated; and creating integration technologies to bring these all together.

Products released under the autonomic auspices include : the eFuse programmable processor, which can reroute traffic around inactive parts of a chip; and the latest DB2 Universal Database, which can tune itself in line with workload fluctuations, optimise queries and reroute to another server to keep the database running if hardware fails or during maintenance.

Yet it was only when this started to impact on IT suppliers' bottom line that they sought to do anything about it. "Everybody [in the vendor community] realised the complexity was threatening their customers' ability to buy more stuff," says Donna Scott, an analyst with research group Gartner.

That simple incentive has spurred management and infrastructure suppliers to find new ways to automate the mundanity of the data centre. IBM, a pioneer in this area, defined a set of features, of what it brands ‘autonomic' computing, that it predicts will revolutionise systems management: systems of the future will be self-configuring, self-protecting, self-optimising and self-healing. The term autonomic comes from the idea of bodily functions, such as breathing, being regulated without conscious direction.

Today, there are many systems management tools which allow for remote systems configuration, for example obviating the need to manually install new software or create new user accounts. Microsoft's Automatic Update installs patches for the Windows operating system without users having to remember to download them, adding a basic level of security. IT departments have set up virtual server farms – enabling several individual boxes to be used as one resource – and automatic software provisioning to enable more capacity to be added to a business service as demand increases.

But self-healing systems have largely remained in the testing labs. There is clearly an addressable market: estimates indicate that around 70% of an IT department's budget is spent maintaining the existing infrastructure; IBM research has found that 70% of problems are ‘repeatable' and so can be addressed programmatically. So where are the products?

Creating a self-healing system sounds deceptively simple: merely connect alerts and error reports to the management systems, and the manual intervention required to make basic fixes or deal with false alarms could be minimised. However, the technological challenge involves bridging more fundamental divides: between applications and infrastructure architects; between business units and IT; and between software vendors' product sets.

Science fiction

It also requires a shift in the way IT managers think about their systems. It is frequently human configuration that causes systems to malfunction, often the product of bored technicians working on repetitive tasks. But an autonomic system can cause problems when it thinks it is helping.

If all this sounds too much like science fiction, echoes of HAL 9000, the psychopathic self-aware computer in Kubrick's 2001: A Space Odyssey, Microsoft UK's security and management product marketing manager, Alfred Beeler describes how it can happen in practice. One customer bought an expensive RAID array to ensure a server would never go down.

Yet when its first hard drive failed, it was so good at failing over that nobody noticed until the second drive also failed. That took the entire service it was supporting offline. "There will still be a need for [manual] monitoring," says Beeler. "Even the best self-healing system could kill itself."

HP: the business perspective

Rather than emphasising the technological challenges of self-managing systems, Hewlett-Packard (HP) seeks to define business needs first, then automatically tune the IT infrastructure to support them.

The company has benefited from two acquisitions: Consera, which models the IT environment that is required to deliver a desired level of service, and Novadigm, which automates the provisioning and reprovisioning of technology to maintain a specified state. HP added this functionality to its OpenView Automation Manager, which can adjust the IT resources supporting a business process automatically once its components have been specified.

HP is attempting to build tools that react to the situation rather than simply automating actions based on scheduling, says Ian Curtis, HP's director of software strategy in the UK. "We are not so much automating the processes in the IT organisation per se but the process of linking those processes to the business environment."

But OpenView lacks the market penetration of other management packages like CA's Unicenter or IBM's Tivoli, and HP cannot claim ownership of the stack as Microsoft does. It has a coherent vision, but it remains to be seen if it can convince data centre managers.

Similarly, in the event of a denial of service attack, a well-meaning management system could just keep provisioning servers until a whole data centre was taken out, if it did not realise the demand was coming from an illegitimate source. This highlights the need for various management elements – like security and provisioning – to communicate with each other. "The end task is a system which can hide all the noise and the mundane tasks but when something more interesting happens it can flag it up," says Beeler.

Management software vendor BMC believes a starting point is to establish thresholds of behaviour. BMC recently released its Patrol Analytics tool which can establish a baseline of what constitutes normal fluctuations in server performance. Alarms are triggered only when the systems operate beyond those limits. "If you set a static threshold that sends an alarm whenever a server goes over 70% utilisation, every Monday morning you would get a false alarm [when utilisation goes up with everyone checking the weekend's emails]," says Kym Wood, BMC's business unit field manager for EMEA.

While hardly a revolution in self-healing systems, it is a good example of how making systems more intelligent can mean fewer people having to manually monitor server behaviour. "With the skill shortage in the IT industry, CIOs have to be careful how they deploy that intelligence," says Wood. But she warns that there is a fine line between usefully redeploying staff away from repetitive management tasks and automating to the point that errors could be introduced. "If a company automates too much it could introduce a major change into the IT infrastructure that would be impacting another service," she says.

In response to this, some infrastructure vendors are taking a wider-focused approach to automating IT management – by looking at business processes rather than technology. IBM, Computer Associates (CA), Hewlett-Packard (HP), Microsoft and Sychron all sell modelling software that allows organisations to map elements of the IT infrastructure to business processes – and then automate the management of the IT, ensuring service is maintained. IBM is also pushing an initiative aiming to map the dependencies between different elements of the infrastructure.

Conflict of interest

But there remains a fundamental problem at the heart of all the vendors' self-healing systems initiatives: they are too far ahead of their would-be customers. Gartner's Scott says that there are few organisations doing even the early stages of autonomic computing, such as the automatic provision of web servers in response to increased demand. The challenge is cultural, not technological; most businesses' IT processes are insufficiently mature to understand what they need to automate.

"If you don't have good controls you're not going to automate. If IT is not well managed it might be a multi-year project just to get to the base level," says Scott. The IT Infrastructure Library (ITIL) is the best service management model for IT departments aspiring to automation, she adds. Many vendors have now introduced its principles to their products.

Microsoft: fixing itself

The software giant takes a typically brash approach to defining the IT infrastructure of the future. "It is not going to be self-healing but self-managed to the point where it never gets ill," says Microsoft's UK security and management product marketing manager, Alfred Beeler.

Microsoft aims to change the way applications are developed. The latest release of its Visual Studio development suite enables .Net developers to produce management packs for home-grown applications, which do not always respond so well to management tools.

This is fundamental to Microsoft's Dynamic Systems Initiative (DSI), an effort to ensure that vendors make their components report effectively on their status to the rest of the environment. DSI has been supported by the likes CA, HP and Dell, but it concentrates on the Windows platform; other vendors' software is superficially managed.

Microsoft has also shown little willingness to pitch in to open standards initiatives launched by other vendors like IBM – a potential problem for larger enterprises with a heterogeneous environment. Nevertheless, Microsoft is pushing several standards of its own, and is working with others to ratify these standards.

Colin Bannister, a consulting manager at CA, says an accurate view of an organisation's assets, using a management database, is a fundamental prerequisite for a self-managing infrastructure that organisations can put in place today. He adds that achieving a full view and automating its management is facilitated by standardisation of the infrastructure, with fewer customised applications.

Ian Curtis, HP's UK director of software strategy, goes further, adding that "most large enterprises have a plethora of different management tools" which, he claims, they want to consolidate to take "a more strategic approach". This may mean ripping and replacing existing software.

In effect, what is proposed is an argument for companies to buy all their software from the same vendor, says Scott: "Really a lot of this is about lock-in. All software for business processes is lock-in and if it is really ingrained in management processes that causes lock-in as well."

But there may be hope that vendors have recognised user reluctance to get locked in to a single provider. "Looking back over the last 40 or 50 years, it's as if in the beginning the aim was to create as many different formats as we can – and we were wildly successful," says IBM's head of autonomic computing, David Bartlett. "To solve this problem this has to be more than a single vendor initiative."

A number of standards initiatives aim to solve the fundamental problem of managing a heterogeneous environment. Currently there are several bodies looking at creating standards for representing IT configurations, creating an integration layer between web services-based management tools and the resources they command, and ways to represent event data. However, work is still at an early stage; it is far from clear how the results will pan out.

Still the standards lay only the barest of foundations for an autonomic future. Web services represent both symptom and cure of today's data centre management ills. Because web services touch many servers, operating systems, networking devices, databases and middleware, it is difficult to pin down the root cause when things go wrong. Standards should solve that problem. "As web services are more common and ubiquitous, they do offer greater potential for dynamic automation because of the way interfaces work between the components," says Gartner's Scott. "A lot of management vendors are making web services out of their own components, turning their management functionality into a web service."

While she bemoans the slow progress on standards, Scott notes that it is in everyone's best interests for vendors to settle their differences. Ultimately, says Microsoft's Beeler, "it's not a big competitive thing. We're all competing with the 70% of the budget that is wasted on management."

Ben Rossi

Ben was Vitesse Media's editorial director, leading content creation and editorial strategy across all Vitesse products, including its market-leading B2B and consumer magazines, websites, research and... More by Ben Rossi

IBM: technology and open standards

Ben Rossi

Related Topics

Related Stories

Andrew McAfee – ‘Human beings are chronically overconfident’

Keys to effective cybersecurity threat monitoring

How businesses can vet their cybersecurity vendors

Five key signs of a bad MSP relationship – and what to do about them

Related Stories

Global file locking and the CAP theorem – how to choose your strategy

5 types of transformation fatigue derailing your IT team

Driving business growth through effective productivity strategies

Breaking down silos between IT and security teams