‘Unprecedented’ storage failure disrupts US state
‘One-in-a-billion’ SAN glitch followed by redundant system failure affects a third of Virginia's government agencies
Citizens in the US state of Virginia were unable to access a number of public services after the state government’s storage area network (SAN) suffered an ‘unprecedented’ hardware failure.
The SAN, a DMX-3 from storage infrastructure vendor EMC, failed last Wednesday afternoon, causing 228 of the Virginia Information Technologies Agency’s (VITA) servers to crash. Critically, VITA’s redundant systems – the spare infrastructure designed to take the strain when primary systems fail – also malfunctioned.
This outage disrupted services for 27 state agencies, including the Department of Motor Vehicles (DMV), Department of Taxation and the State Board of Elections. Most of the services were back online by yesterday morning but the DMV is still unable to issue driver’s licenses.
EMC told VITA that the outage was “unprecedented”, according to a statement from Virginia’s secretary of technology Jim Duffey. “The manufacturer reports that the system and its underlying technology have an exemplary history of reliability, industry-leading data availability of more than 99.999 percent and no similar failure has occurred.”
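For a sense of scale, 99.999 percent availability leaves room for only a few minutes of downtime a year. A minimal sketch of that arithmetic follows, assuming a 365-day year and treating the figure as a simple uptime fraction; the function name is illustrative and does not come from EMC or VITA.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: a 365-day year


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Minutes of downtime per year permitted by an availability percentage."""
    unavailable_fraction = (100.0 - availability_percent) / 100.0
    return unavailable_fraction * MINUTES_PER_YEAR


if __name__ == "__main__":
    # "Five nines" works out to a little over five minutes a year.
    print(round(allowed_downtime_minutes(99.999), 2))
```

By that measure, an outage lasting days, as reported here, is far outside what the quoted availability figure would allow in a single year.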
VITA is now “working tirelessly” to restore lost data, Duffey said.
Earlier this year, US email hosting provider Intermedia suffered an outage after a hardware failure in its EMC storage area network. “This failure caused the entire load for that SAN to be shifted to the service processor on the redundant controller node,” the company said at the time. “The spare capacity on the single service processor was not enough to handle the entire load of all systems connected to the SAN.”

I'm not in a position to comment on this particular outage, but I'd hardly call the "failure" of a DMX-series storage array unprecedented. A few years ago, we saw a DMX-3000 lock up in such a way that all storage services were effectively "down" for about 30 hours. Luckily, we did not lose any critical data, but multiple levels of EMC "health checks" came back clean before someone on their staff found the real culprit inside the storage array and got it working again.
As to the closing paragraph of this article, "storage processors" and the failure mode described sound like a feature of the EMC CLARiiON line of storage arrays rather than the DMX line.