The ultimate goal of modern storage technology may be to make it look as if all of an organisation’s data is available as one unified virtual resource. Underneath this logical veneer, however, the physical reality of storage systems is as complex as ever – a constantly shifting jigsaw composed of legacy, mainstream and emerging technologies which is as necessarily varied as the applications that it ultimately supports.
In recent years the major evolutionary theme in the storage arena has undoubtedly been the shift away from direct attached storage (DAS) towards storage area networks (SANs) and network attached storage (NAS) systems. In fact, taken as a whole, sales of networked storage units exceeded those of DAS for the first time in 2003. DAS persists in most data centres, and will continue to do so for as long as the legacy applications it supports remain in service. However, there is no doubt that new storage investment will largely go into building SAN and NAS systems and, increasingly, systems that combine elements of both technologies.
Certainly, very few organisations can really make an either/or choice. SAN systems provide the fine-grained, so-called block-level access to data that is needed to optimise the performance of databases and database-centric applications such as customer relationship management (CRM) and enterprise resource planning (ERP). NAS, on the other hand, better supports the kind of file-level access needed for document- or file-centric applications such as office, workflow and collaborative environments.
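The block-level versus file-level distinction can be illustrated in a few lines of Python: a SAN-style client addresses data by numeric block offset on a volume, while a NAS-style client addresses it by file name. The 'volume' below is just a temporary file standing in for a real block device, which would require privileged access.

```python
import tempfile

# A temporary file standing in for a disk volume (a real raw device
# such as /dev/sdb would need privileged access).
volume = tempfile.NamedTemporaryFile(delete=False)
volume.write(b"A" * 512 + b"B" * 512 + b"C" * 512)
volume.close()

BLOCK_SIZE = 512

def read_block(path, block_number):
    """Block-level access: address data by block offset, as over a SAN."""
    with open(path, "rb") as dev:
        dev.seek(block_number * BLOCK_SIZE)
        return dev.read(BLOCK_SIZE)

def read_file(path):
    """File-level access: address data by name, as a NAS client does."""
    with open(path, "rb") as f:
        return f.read()

print(read_block(volume.name, 1)[:4])   # the second 512-byte block
print(len(read_file(volume.name)))      # the whole file
```

A database engine wants the first kind of interface, so it can lay out and update its own on-disk structures; an office application only needs the second.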
The service level requirements of some very large applications may justify an investment in a pure SAN or NAS system, but in practice most organisations will begin to blend the two. Typically, this will involve deploying NAS ‘headers’: devices that apply NAS management logic to shared disk resources, rather than the dedicated disk arrays used in first-generation NAS servers.
Thanks to SAN/NAS convergence all data can now be held in a single physical SAN, but the system’s capacity can be logically partitioned to accommodate both block-level access over Fibre Channel (FC), SCSI or iSCSI, and file-level access over IP. For block-level access, devices which do not support native SCSI or FC host bus interfaces will require appropriate bridges to link to the SAN switches or, if desired, to connect directly to a server. For file-level access, all servers access data through a NAS header.
Once users have chosen which access method to use for each application, they must then choose an appropriate communications protocol based on price/performance parameters. Fibre Channel still offers the best performance, and can operate across distances of around 100km.
To link more distant data centres, FC can be encapsulated in IP packets (FCIP), and where global reach is needed, users can deploy FC over wave division multiplex links (FC-over-WDM).
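The principle behind FCIP is straightforward tunnelling: each FC frame is wrapped in a header and carried as the payload of a TCP/IP connection between two data centres. The sketch below illustrates the idea with a minimal length-prefixed wrapper only; the real FCIP standard (RFC 3821) uses the more elaborate FC Frame Encapsulation format of RFC 3643, with version, flag and CRC fields.

```python
import struct

def encapsulate_fc_frame(fc_frame: bytes) -> bytes:
    """Wrap an FC frame for transport over a TCP/IP tunnel.

    Simplified illustration only: a 4-byte big-endian length prefix,
    not the actual RFC 3643 encapsulation header that FCIP uses.
    """
    return struct.pack("!I", len(fc_frame)) + fc_frame

def decapsulate_fc_frame(payload: bytes) -> bytes:
    """Recover the original FC frame at the far end of the tunnel."""
    (length,) = struct.unpack("!I", payload[:4])
    return payload[4:4 + length]

# A stand-in byte string; a real FC frame carries a 24-byte header,
# SCSI payload and CRC between SOF/EOF delimiters.
frame = b"\x22\x00\x00\x01" + b"SCSI-payload"
assert decapsulate_fc_frame(encapsulate_fc_frame(frame)) == frame
```

Because the FC traffic rides inside ordinary TCP, it can traverse any IP WAN link, which is what gives FCIP its reach between distant sites.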
The drawback of FC is cost. FC host bus adaptors (HBAs) are expensive and, where their expense is not justified by service level requirements, slower, shorter distance links can be achieved using commodity disk interfaces such as SATA (serial ATA), or the ‘economy-class’ technology of iSCSI (SCSI over IP).
iSCSI is increasingly being spoken of as a cheaper, mainstream alternative to FC. However, most tape and disk devices still lack native iSCSI connectivity, so any perceived cost saving will normally be cancelled out by the need to install an expensive bridging device.
SATA, a more muscular version of the ATA interface built into commodity PC-class disks, offers a cheaper, nearline alternative to FC disk, to which data can be written, and from which it can be retrieved, far more rapidly than with tape. Backups can then be sent from SATA to tape without disruption to the production environment.
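The disk-to-disk-to-tape staging pattern described above can be sketched as follows. The directory names are hypothetical stand-ins for a production volume, a SATA nearline pool and a tape spool area; a real deployment would use backup software rather than plain file copies.

```python
import os
import shutil
import tempfile

# Hypothetical stand-ins for the three tiers involved.
production = tempfile.mkdtemp(prefix="prod_")
nearline = tempfile.mkdtemp(prefix="sata_")
tape_spool = tempfile.mkdtemp(prefix="tape_")

# A sample production file to be protected.
src = os.path.join(production, "orders.db")
with open(src, "wb") as f:
    f.write(b"x" * 1024)

# Stage 1: a fast disk-to-disk copy onto the SATA pool, keeping the
# window of disruption to the production environment short.
staged = shutil.copy(src, nearline)

# Stage 2: later, stream from SATA to tape with no further load on
# the production volume at all.
archived = shutil.copy(staged, tape_spool)

print(os.path.getsize(archived))
```

The point of the intermediate SATA hop is that the slow tape write happens entirely off the production path.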
Similar price/performance issues come into play when archiving to tape is considered. DAT, VXA and AIT-1 are all popular, economical formats for backing up local servers, but they do not offer the performance required for larger systems.
Here, digital linear tape (DLT) format devices supporting transfer rates of 2.5MBps are a better option, and are moderately priced. However, DLT has reached the physical limit of its capacity (80GB), forcing larger users to invest in SuperDLT, LTO and SAIT. Using 2:1 compression these formats can handle up to 600GB, 400GB and 1.3TB of data respectively. Some tape libraries will support both DLT and SDLT formats, but SAIT, the largest capacity format, uses an entirely different recording technology and has proved unpopular in many organisations across Europe.
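The capacity and throughput figures quoted above can be sanity-checked with some simple arithmetic. The native capacities below are back-calculated by halving the 2:1 compressed figures in the text, so they are approximations rather than vendor specifications.

```python
def compressed_capacity(native_gb, ratio=2.0):
    """Usable capacity at the quoted compression ratio."""
    return native_gb * ratio

# Native capacities implied by halving the 2:1 compressed figures.
formats_gb = {"SuperDLT": 300, "LTO": 200, "SAIT": 650}
for name, native in formats_gb.items():
    print(f"{name}: {compressed_capacity(native):.0f}GB compressed")

def hours_to_fill(capacity_gb, rate_mb_per_s):
    """Time to stream a full cartridge at a sustained transfer rate."""
    return capacity_gb * 1024 / rate_mb_per_s / 3600

# Filling an 80GB DLT cartridge at the quoted transfer rate takes
# roughly nine hours, which is why larger sites outgrow the format.
print(round(hours_to_fill(80, 2.5), 1))
```

The cartridge-fill calculation makes the capacity ceiling concrete: it is the combination of limited capacity and long streaming times that pushes larger users towards the newer formats.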
Probably the most important architectural consideration facing storage buyers today is that of resource virtualisation and management automation. The issues here are complex, and many of the technologies are immature and proprietary, so users should proceed with caution.
For instance, many vendors see the development of virtualisation centring on deployment of intelligent switches in the so-called SAN fabric. These devices will manage storage traffic according to user-defined priorities, and will ultimately dynamically match SAN behaviour to the shifting requirements of applications. As such, they promise to be a key element in the realisation of grid and utility computing models. But the products available today are little more than prototypes, and the opportunities for interoperation are slight.
In the meantime, users can deploy proprietary hierarchical storage management (HSM) and information life-cycle management (ILM) regimes with some confidence that standards will eventually emerge to allow them to be supported across multiple devices. However, a hasty commitment to full-scale storage virtualisation at this stage is a decision that some users may come to regret.