Storage costs have never been lower. Data volumes have never been higher. For many hard-pressed IT managers, that makes buying more hardware to accommodate burgeoning data seem a relatively easy option. In a recent survey by market research company Vanson Bourne, 80% of respondents admitted that they simply purchase more storage rather than investigate ways of getting more from existing resources.
The view that this approach is “pain-free”, however, could not be more wrong, says Jon Pavitt, professional services manager at StorageTek UK and Ireland. “It quickly becomes extremely expensive – not just in terms of capital expenditure but also in terms of management costs,” he says. By first rationalising the existing storage infrastructure to maximise its use, IT managers can delay or eliminate the need for new primary capacity to manage information growth.
Not only that, but efficiencies can also be achieved through eliminating redundant copies, moving less critical data to less expensive storage media and recovering over-reserved capacity.
That is where information lifecycle management (ILM) comes in. ILM concepts can be applied to assess data uses and storage assets, identify inefficiencies and adjust the infrastructure to maximise utilisation. It is much needed. According to research by IT market analyst company Gartner, utilisation of typical direct-attached storage rarely exceeds 50%. Highly efficient networked storage environments that are organised according to ILM principles, by contrast, can operate at 70% to 90% capacity utilisation.
But that is entirely dependent on an organisation having a good understanding of the types of data it holds, the value that data has, and where it is stored. ILM works on the principle that not all data is equal and that it is used in different ways at different stages of its lifecycle. For example, monthly financial reports may combine sales orders, shipments, inventory expense and other data for the month.
During the processing cycle, the finance department needs access to verify and analyse this data frequently and rapidly. After the reports are generated, however, the previous month’s data is referenced less frequently as the focus changes to data for the current month. Previous reports can then be migrated to less-available, lower-cost storage, thus freeing up primary, high-cost storage.
Few companies, as yet, have that level of visibility into their data environment and the needs of different data categories, says Pavitt. As a result, one of the main roles his team performs is to help companies to gain that insight. “Our job, in some ways, is to make a nuisance of ourselves,” he jokes. “We go into an organisation and ask lots of questions about what data they have and where it is kept. They frequently don’t have answers – but that’s the whole point.”
The first step in getting those answers is a review of the current storage environment by evaluating data, categorising it, and then applying business rules to each category.
“Data valuation starts with listing data types and ranking them based on how often the data is accessed, who uses it, and so on,” explains Pavitt. “IT managers can then map the current location of the data across a hierarchy of storage systems, from high-performance disk to low-cost tape systems.”
“It’s vital to analyse current storage usage before embarking on any major storage implementation or upgrade. You need to know what you’re holding, where you hold it and what data has specific compliance requirements,” says Nigel Ghent, UK marketing director at storage supplier EMC. “That is a huge job of work in itself,” he adds.
There are ways, however, to minimise the effort involved, says Phil Goodwin, an analyst at IT market research company the Meta Group. “Obviously, treating each data type individually is impractical. The number of associated policy implementations would be unmanageable,” he says.
Instead, he suggests classifying data elements according to specific attributes (see table, Sample data element attributes and categories). “The objective is to reduce the number of elements, and therefore the number of policies, to an amount that can be effectively implemented and managed,” he says.
From there, business rules can be applied to different data types (see table, Sample business rules matrix).
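Goodwin's approach can be sketched in code. The attribute names, category labels and thresholds below are illustrative assumptions for the sake of the example, not taken from Meta Group's tables:

```python
# A minimal sketch of attribute-based data classification. The attributes,
# categories and thresholds here are hypothetical examples only.
from dataclasses import dataclass

@dataclass
class DataElement:
    name: str
    accesses_per_month: int   # how often the data is read
    business_critical: bool   # does a core process depend on it?
    compliance_hold: bool     # must it be retained for regulators?

def classify(element: DataElement) -> str:
    """Map a data element to one of a small number of policy categories."""
    if element.business_critical and element.accesses_per_month > 100:
        return "mission-critical"
    if element.compliance_hold:
        return "retain"
    if element.accesses_per_month > 10:
        return "active"
    return "archive"

payroll = DataElement("payroll", accesses_per_month=500,
                      business_critical=True, compliance_hold=True)
old_reports = DataElement("FY02 reports", accesses_per_month=1,
                          business_critical=False, compliance_hold=True)

print(classify(payroll))      # mission-critical
print(classify(old_reports))  # retain
```

The point, as Goodwin notes, is the reduction: a handful of categories means a handful of policies, rather than one per data type.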
That auditing process, however, should not be too exhaustive, warns Simon Gay, consulting practice leader at systems integration company Computacenter: “The danger of the categorisation process is that some companies are all talk and no action. You could explore endlessly the kinds of data your organisation holds and how it should be stored, but until you start putting ILM into practice, you’re probably no closer to compliance.”
Early wins can be gained, he says, by identifying the most important data a company holds and applying ILM policies to it, before moving on to less significant data groups.
That assessment should give storage administrators a clear view of the requirements of different data types and how well these requirements are currently being met. They are then in a position to move data where appropriate to lower-cost and lower-performance storage classes, while still delivering adequate performance for mission-critical applications.
That requires an assessment of the tiers of storage that exist below primary, high-cost, high-performance disk, says Tim Mortimer, business manager at storage integration company InTechnology. “We generally advocate four levels of storage. First: fast access, high-performance primary disk. Second: low-cost disk such as SATA [serial advanced technology attachment] disk. Third: tape technology where data must be retained but is unlikely to be referenced again. Fourth: offline tape in a secure facility, possibly offsite, which can be manually reintroduced into a tape library in the very unlikely event that it needs to be recalled,” he says.
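Combined with a classification scheme, the four-tier model reduces to a simple placement policy. The tier descriptions below follow Mortimer's list; the category names and the mapping between categories and tiers are assumptions for illustration:

```python
# The four storage tiers described above, as a hypothetical placement policy.
TIERS = {
    1: "high-performance primary disk",
    2: "low-cost SATA disk",
    3: "tape library (retained, rarely referenced)",
    4: "offline tape in a secure facility",
}

def tier_for(category: str) -> int:
    """Pick a storage tier for a data category (illustrative rules only)."""
    return {
        "mission-critical": 1,
        "active": 2,
        "retain": 3,
        "archive": 4,
    }.get(category, 2)  # default unclassified data to low-cost disk

print(TIERS[tier_for("mission-critical")])  # high-performance primary disk
```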
The emergence of SATA disks has done much to boost ILM efforts, he says. SATA arrays can store data at a fraction of the cost of high-performance disk. For holding point-in-time copies of data, for example, SATA is often the obvious choice.
That is not to say, however, that tape technology is becoming redundant, argues Derek Lewis at Morse: “Despite advances in disk, there are still huge advantages to tape technology. I’ve recently seen tape technology that can hold around 1.5 terabytes on an £80 tape. The costs involved in storing on tape large volumes of data that will probably not be accessed again are now staggeringly low, and disk – even low-cost disk – still cannot match them.”
One StorageTek customer, for example, uses ILM to balance availability and cost by automating payroll data management and migration. Payroll processing is a mission-critical application, so it made sense to store the data on high-performance disk during the processing cycle and replicate it every two hours.
Once the pay cycle is complete, the automated management system now moves payroll data to mid-range SATA disk arrays. At this stage, users can access payroll data from the company’s web site for a period of three months.
After three months, the data is written to a tape library, which is on the same campus as the data archive. For disaster recovery protection, the data is replicated to a remote location, where it is stored on a back-up tape library.
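The payroll lifecycle described above amounts to an age-driven placement rule. A minimal sketch, assuming a 90-day window for the SATA stage (the article's three months); the function name and the `cycle_open` flag are hypothetical:

```python
# A sketch of the age-driven migration policy in the payroll example above.
# The three-month threshold comes from the article; everything else
# (names, the notion of days since cycle end) is an illustrative assumption.

def placement(days_since_cycle_end: int, cycle_open: bool) -> str:
    """Decide where payroll data should live at this point in its lifecycle."""
    if cycle_open:
        # Mission-critical processing: fast disk, replicated every two hours.
        return "primary disk (replicated every two hours)"
    if days_since_cycle_end <= 90:
        # Still accessible to users via the company web site.
        return "mid-range SATA disk (web-accessible)"
    # Long-term retention, with a remote copy for disaster recovery.
    return "tape library (replicated to remote back-up library)"

print(placement(0, cycle_open=True))
print(placement(30, cycle_open=False))
print(placement(200, cycle_open=False))
```

An automated management system of the kind StorageTek describes would evaluate a rule like this on a schedule and trigger the migrations, rather than relying on administrators to move data by hand.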
ILM is an ongoing process – data storage administrators will need to continually maintain a balance between data performance needs and storage options, says Pavitt of StorageTek. “The struggle is to get the client to realise that getting benefit out of ILM is only 20% about technology and 80% about business processes. It’s that kind of housework that drives the biggest savings,” he says.