Information Age: News, analysis & insight for IT & business leaders

‘Unprecedented’ storage failure disrupts US state

31 August 2010  

‘One-in-a-billion’ SAN glitch followed by redundant system failure affects a third of Virginia's government agencies

Citizens in the US state of Virginia were unable to access a number of public services after the state government’s storage area network (SAN) suffered an ‘unprecedented’ hardware failure.

The SAN, a DMX-3 from storage infrastructure vendor EMC, failed last Wednesday afternoon, causing 228 of the Virginia Information Technologies Agency’s (VITA) servers to crash. Critically, VITA’s redundant systems – the spare infrastructure designed to take the strain when primary systems fail – also malfunctioned.

This outage disrupted services for 27 state agencies, including the Department of Motor Vehicles (DMV), Department of Taxation and the State Board of Elections. Most of the services were back online by yesterday morning, but the DMV is still unable to issue driver’s licenses.

EMC told VITA that the outage was “unprecedented”, according to a statement from Virginia’s secretary of technology Jim Duffey. “The manufacturer reports that the system and its underlying technology have an exemplary history of reliability, industry-leading data availability of more than 99.999 percent and no similar failure has occurred.”
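For a sense of scale, the "99.999 percent" figure can be turned into a yearly downtime allowance. The sketch below is illustrative arithmetic only: the function name is hypothetical, a 365-day year is assumed, and nothing here comes from EMC or VITA.

```python
# Illustrative arithmetic: convert an availability percentage into the
# maximum downtime per year it allows. Assumes a 365-day year.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Return the yearly downtime, in minutes, consistent with the given availability."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


if __name__ == "__main__":
    for pct in (99.9, 99.99, 99.999):
        minutes = allowed_downtime_minutes(pct)
        print(f"{pct}% availability allows {minutes:.2f} minutes of downtime per year")
```

On these assumptions, 99.999 percent availability allows a little over five minutes of downtime per year, which is why an outage lasting days stands out against the quoted figure.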

VITA is now “working tirelessly” to restore lost data, Duffey said.

Earlier this year, US email hosting provider Intermedia suffered an outage after a hardware failure in its EMC storage area network. “This failure caused the entire load for that SAN to be shifted to the service processor on the redundant controller node,” the company said at the time. “The spare capacity on the single service processor was not enough to handle the entire load of all systems connected to the SAN.”
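The failure mode Intermedia describes, where a surviving component cannot carry the load shifted onto it, comes down to a simple capacity check. The sketch below is a simplified model, not a description of any EMC product: the function name, the load figures and the assumption that all nodes have equal capacity and can share load perfectly are all hypothetical.

```python
# Simplified model of failover headroom. Hypothetical: assumes every node
# has the same capacity and that load redistributes perfectly on failure.
from typing import Sequence


def survives_single_failure(loads: Sequence[float], capacity_per_node: float) -> bool:
    """Return True if the remaining nodes can absorb the total load after any one node fails."""
    node_count = len(loads)
    if node_count < 2:
        # With fewer than two nodes there is nothing to fail over to.
        return False
    remaining_capacity = (node_count - 1) * capacity_per_node
    return sum(loads) <= remaining_capacity


if __name__ == "__main__":
    # Two nodes at 40% each: the survivor would run at 80% and cope.
    print(survives_single_failure([40, 40], capacity_per_node=100))
    # Two nodes at 70% and 60%: the survivor would need 130% and cannot cope.
    print(survives_single_failure([70, 60], capacity_per_node=100))
```

In a two-node pair, the check passes only if the combined load fits on one node, meaning each node normally runs at no more than about half its capacity.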


Comments  [1]

Christopher Petersen
Thursday 9th September 2010

I'm not in a position to comment on this particular outage, but I'd hardly call the "failure" of a DMX-series storage array unprecedented. A few years ago, we saw a DMX-3000 lock up in such a way that all storage services were effectively "down" for about 30 hours. Luckily, we did not lose any critical data, but multiple levels of EMC "health checks" came back clean before someone on their staff found the real culprit inside the storage array and got it working again.

As to the closing paragraph of this article, "storage processors" and the failure mode described sound like a feature of the EMC CLARiiON line of storage arrays rather than the DMX line.
