How is a data centre like a software development house?

How is a data centre like a software development house? It sounds like the beginning of a bar bet, but the truth is that the two have a great deal in common. Besides the long hours, the need for pizza at odd hours of the night, and the always-on air conditioning in the office, both organisations have to tread very lightly when making changes to their work.

When writing software, programmers test all changes against the previous versions of their work to make sure that the changes have not damaged the delicate balance between different elements of a program, and a program’s interaction with a computer or network.

The best way to do this, professionals agree, is via regression testing, in which programmers re-run their tests after each change and compare the software’s behaviour and performance with how it behaved before the change was implemented.

It’s the best way to counteract the “butterfly effect”, where even a slight change in one part of the system can throw an entire project out of whack, whether it’s a program with thousands of lines of code that needs to interact with a computer, network, and user input, or a data centre that has thousands of programs and hundreds of pieces of hardware and peripherals that need to work in tandem.

For programmers, regression testing represents an important tool to ensure that changes to code do not negatively impact the performance of the program. Using regression testing tools, programmers can test changes to the code they are working on and examine the performance of the program before and after those changes. And the regression testing tools in use today are automated, allowing every line of code changed, added, or removed to be examined, a task that would be impossible by hand.

For example, an application that enables medical staff to update details of patients in a health clinic could include buttons that allow for Adding, Deleting, or Saving information. The program does what it is supposed to do, but now management has decided that it wants a refresh function added to update a record in real time. Programmers add that module, and it works as well, but overall performance of the program suffers. Why? And what can be done about it?

With regression testing tools, the programmers can evaluate the full impact of changes wrought by the addition of the module on the rest of the computing environment, and on the rest of the program. If a change negatively impacts the performance of a program in any way – if, for example, a change to the user interface results in a delay in connecting to the Internet – programmers know right away, and can correct or revise the changes they made in order to ensure optimal performance.
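As a rough illustration of the before-and-after comparison described above, here is a minimal Python sketch. It is not taken from any particular regression testing tool: the function name `find_regressions`, the 10% tolerance, and the timings for the clinic application's operations are all hypothetical, chosen only to show the idea.

```python
def find_regressions(baseline, current, tolerance=0.10):
    """Return operations that slowed down by more than `tolerance` (a fraction).

    `baseline` and `current` map an operation name to its measured time
    in seconds, before and after the change respectively.
    """
    regressions = {}
    for operation, before in baseline.items():
        after = current.get(operation)
        if after is None:
            continue  # operation no longer exists, so there is nothing to compare
        slowdown = (after - before) / before
        if slowdown > tolerance:
            regressions[operation] = round(slowdown, 2)
    return regressions


# Hypothetical timings (seconds) for the clinic application,
# measured before and after the refresh module was added.
baseline = {"add": 0.20, "delete": 0.10, "save": 0.40}
after_refresh = {"add": 0.21, "delete": 0.10, "save": 0.90, "refresh": 0.30}

regressions = find_regressions(baseline, after_refresh)
for operation, slowdown in regressions.items():
    print(f"{operation} is {slowdown:.0%} slower than before the change")
```

With these made-up numbers only the save operation is reported, which tells the programmers where to start looking. A real tool would gather the timings by running the test suite automatically rather than from hand-written dictionaries.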

Data centre managers face the same dilemma. The addition of new software, services, equipment, and clients to the “mix” can impact the performance of the whole, creating havoc – not just for a group of programmers working on deadline, but for thousands of workers in hundreds of firms that have migrated most of their major operations to the cloud, which resides on the servers of the data centre.

An automated regression testing system – where the impact of each change or addition is analysed, and performance pre- and post-change is examined and compared – could save data centre managers, along with their cloud customers, a great deal of anguish, frustration, and money.

Could something as simple as a software upgrade to a single system – one of thousands in a data centre – really have such an impact? Ask the folks at the New York Stock Exchange, who found themselves facing a major outage on July 8, 2015, a day that will live in infamy for the many traders who were stuck for hours while the world’s most important stock trading platform was out of commission.

A subsequent examination proved that it wasn’t something as scary – or sexy – as hackers or cyber-terrorists that had brought the market to a halt, but a “glitch” in the rollout of a new version of software used by the NYSE. According to the Exchange, “the rollout of a software release” that was loaded onto computers “not loaded with the proper configuration compatible with the new release” caused the outage. Although not a data centre per se, the NYSE’s platform is depended upon by a great many people – just as a data centre is.

The same applies to any change. Outages are far too common at data centres, and can occur for any number of reasons. Even something as simple as adding more storage or applying routine patches and updates can cause one, while incorrect driver and firmware configuration, or inconsistently applied patches, can leave systems exposed.

Of course, many changes made to a data centre are far more complex. A major update of the virtualisation software, for example, can involve adapting to hundreds of new vendor best practices and, if incorrectly designed or performed, might affect thousands of VMs. With millions of possible configurations, many of them interconnected and mutually dependent, no human being can work through every consequence by hand.

In fact, according to a recent study by the University of Chicago, the most common reason for outages in organisations is “unknown.” If a regression testing approach were adopted for IT change validation, a process could be established to thoroughly assess the correctness of each change by validating that no critical vendor best practices had been breached, that changes were consistently applied throughout the entire data centre, and that important resiliency KPIs were still met.
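The three checks named above (best practices, consistency, and resiliency KPIs) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the host names, the `patch_level` and `storage_paths` fields, and the rules themselves are invented for the example and do not come from any vendor or product.

```python
def validate_change(hosts, best_practices, kpis):
    """Run three kinds of post-change checks and return a list of findings."""
    findings = []

    # 1. Vendor best practices: each rule is (description, test on one host).
    for name, config in hosts.items():
        for description, rule in best_practices:
            if not rule(config):
                findings.append(f"{name}: breaches best practice '{description}'")

    # 2. Consistency: the change should have reached every host.
    patch_levels = {config["patch_level"] for config in hosts.values()}
    if len(patch_levels) > 1:
        findings.append(f"inconsistent patch levels: {sorted(patch_levels)}")

    # 3. Resiliency KPIs: each KPI is (description, test on the whole estate).
    for description, check in kpis:
        if not check(hosts):
            findings.append(f"resiliency KPI not met: {description}")

    return findings


# Hypothetical post-change inventory of two hosts.
hosts = {
    "host-a": {"patch_level": "2024.06", "storage_paths": 2},
    "host-b": {"patch_level": "2024.05", "storage_paths": 1},
}
best_practices = [
    ("at least two storage paths", lambda config: config["storage_paths"] >= 2),
]
kpis = [
    ("at least two hosts in the cluster", lambda all_hosts: len(all_hosts) >= 2),
]

findings = validate_change(hosts, best_practices, kpis)
for finding in findings:
    print(finding)
```

In this made-up inventory, host-b is flagged for a single storage path and the estate is flagged for mixed patch levels, while the KPI passes. Keeping the rules as data rather than code means new vendor guidance can be added without touching the checker.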

Of course, given the complexity of a modern data centre and the frequency of change, it would make sense to fully automate regression testing – with a workflow that flags each deviation in the IT change control environment. An approach like this could guarantee that even small changes that might escape the attention of workers will be tested individually.
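One way to picture such a workflow is as a comparison of configuration snapshots taken before and after a change, with every difference checked against what the change request said would happen. The Python sketch below assumes exactly that; the setting names, the version strings, and the idea of an `approved` set are hypothetical and stand in for whatever a real change control system records.

```python
ABSENT = "<absent>"  # marker for a setting that exists in only one snapshot


def find_deviations(before, after):
    """List every setting whose value differs between two snapshots."""
    deviations = []
    for key in sorted(set(before) | set(after)):
        old = before.get(key, ABSENT)
        new = after.get(key, ABSENT)
        if old != new:
            deviations.append((key, old, new))
    return deviations


def flag_unapproved(deviations, approved):
    """Keep only the deviations that the change request did not mention."""
    return [d for d in deviations if d[0] not in approved]


# Hypothetical snapshots of one server around a hypervisor upgrade.
before = {"hypervisor_version": "7.0", "nic_driver": "1.4.2", "ntp_server": "10.0.0.1"}
after = {"hypervisor_version": "8.0", "nic_driver": "1.5.0", "ntp_server": "10.0.0.1"}
approved = {"hypervisor_version"}  # what the change request said would change

deviations = find_deviations(before, after)
unapproved = flag_unapproved(deviations, approved)
for key, old, new in unapproved:
    print(f"unapproved change: {key} went from {old} to {new}")
```

Here the hypervisor upgrade is expected, but the network driver changed as a side effect and is flagged for review. That is the kind of small, easily missed change the article has in mind.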

Sourced by Yaniv Valik, VP Product of Continuity Software


Nick Ismail

Nick Ismail was an editor for Information Age from 2018 to 2022 before moving on to become Global Head of Brand Journalism at HCLTech. He has a particular interest in smart technologies, AI and...
