Google Spins Out Hard Disk Paper
If you’re a hardcore hardware geek, then the work compiled in the 13-page paper by some Google engineers on the failure patterns of disk drives will be like a belated Valentine from the little red-headed girl.
Google’s paper titled ‘Failure Trends in a Large Disk Drive Population’ was crafted for the Usenix conference on File and Storage Technologies.
With little formal research published on why drives fail, Eduardo Pinheiro, Wolf-Dietrich Weber and Luiz André Barroso did what engineers do and researched it themselves.
They note in the paper’s abstract that an estimated 90 percent or more of the new information produced around the globe is stored on magnetic media, most of it on hard disk drives; that’s based on 2002 research performed at UC-Berkeley.
In most cases, that storage means a hard drive. Since the engineers just happen to have access to "a large disk drive population in a production Internet services deployment," they decided to look at that and see what they could learn about drive failures.
That research made it into their report (PDF). It’s what they didn’t find that proves the most interesting.
The engineers identified several parameters from the drive’s self-monitoring facility, called SMART, that correlate highly with failures. Even so, they concluded that models built on SMART parameters alone are "unlikely to be useful" for predicting individual drive failures. "Surprisingly, we found that temperature and activity levels were much less correlated with drive failures than previously reported," they wrote.
Heavy use and high temperatures have long been considered early warning signs of drive failure, but the engineers did not find them to be reliable red flags of imminent failure.
Many drives in their analysis failed without tripping any SMART indicators.
They also noted that after a drive suffered its first scan error, it was 39 times more likely to fail within a 60-day period than drives lacking those errors.
Predicting individual failures from SMART data alone looks highly inaccurate; SMART may be more useful for spotting trends across a drive population.
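For readers who want to see how a figure like "39 times more likely" can sit alongside poor prediction, here is a minimal Python sketch. Every count in it is invented for illustration and none comes from the Google paper; the `Group` class and both helper functions are hypothetical names. It compares the failure rate of drives with a warning sign against those without, then checks what share of all failures the warning sign actually caught.

```python
from dataclasses import dataclass


@dataclass
class Group:
    """Drive counts for one group. All figures used below are made up."""
    drives: int
    failed_within_60_days: int

    @property
    def failure_rate(self) -> float:
        return self.failed_within_60_days / self.drives


def relative_risk(flagged: Group, unflagged: Group) -> float:
    """How many times more likely a flagged drive is to fail than an unflagged one."""
    return flagged.failure_rate / unflagged.failure_rate


def share_of_failures_flagged(flagged: Group, unflagged: Group) -> float:
    """Fraction of all failed drives that showed the warning sign beforehand."""
    total_failed = flagged.failed_within_60_days + unflagged.failed_within_60_days
    return flagged.failed_within_60_days / total_failed


# Hypothetical population, NOT data from the paper.
with_scan_error = Group(drives=1_000, failed_within_60_days=100)
without_scan_error = Group(drives=99_000, failed_within_60_days=400)

risk = relative_risk(with_scan_error, without_scan_error)
caught = share_of_failures_flagged(with_scan_error, without_scan_error)

print(f"Flagged drives fail {risk:.2f} times as often as unflagged drives")
print(f"Share of all failures that were flagged first: {caught:.0%}")
```

In this invented population the flagged drives fail far more often than the rest, yet most of the failures still come from drives that never raised the flag. That is one way a signal can correlate strongly with failure and still miss most individual failures.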