Update to Q1 2017: Seagate redeemed?
Update June 8 2017
After some delay, I finally got around to downloading another 9 months of data and rerunning the KM plots. Methods are documented in the first post http://bioinformare.blogspot.com/2016/02/survival-analysis-of-hard-disk-drive.html and won't be repeated here. Note that drive models with fewer than 500 units, and manufacturers with fewer than 200 units, are ignored to simplify the plots - you can change these thresholds in the code if you need to.
The images below are really too detailed to display properly here - sorry for the ugly layout. Full-size versions are available at https://github.com/fubar2/backblazeKM, so you can download them or clone the repository for the closer inspection they deserve.
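As a rough illustration of that filtering step, here is a minimal Python sketch. It is not the code used for the analysis: the counts are invented and the names `MIN_MODEL_UNITS`, `MIN_MANUFACTURER_UNITS` and `keep_large_groups` are hypothetical. The idea is simply to count units per group and keep only the groups that reach the threshold.

```python
from collections import Counter

# Thresholds taken from the text above; the constant names are hypothetical.
MIN_MODEL_UNITS = 500
MIN_MANUFACTURER_UNITS = 200

def keep_large_groups(labels, min_units):
    """Return the set of labels that occur at least min_units times."""
    counts = Counter(labels)
    return {label for label, n in counts.items() if n >= min_units}

# Toy data, one entry per drive unit (invented, not Backblaze data).
models = ["MODEL_A"] * 600 + ["MODEL_B"] * 499 + ["MODEL_C"] * 500
kept_models = keep_large_groups(models, MIN_MODEL_UNITS)

# MODEL_B has fewer than 500 units, so it is dropped from the plots.
print(sorted(kept_models))
```

Changing either constant is all it takes to include the smaller groups, at the cost of busier plots.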
Straight to the chase. Here's the drive model survival curve to date:
The newer Seagate ST8000NM0055 is promising excellent longevity, although it has only been observed for a very short time, so the initial curves may change.
Also, I haven't tested whether this is related to the SMART drive statistics, because I haven't looked at them since discovering a lot of missing data when I first examined them.
This of course leads to some improvement in Seagate's manufacturer KM curve, but that curve is weighed down heavily by some early drives with long observation periods and high failure rates over time - the curves for those drives drop off rapidly. I'm not sure how useful the manufacturer curves are, because poorly performing older models can mask a newer model that looks respectably long-lived.
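To make the short-observation caveat concrete, here is a minimal Python sketch of the Kaplan-Meier product-limit estimate. It is not the R code behind the plots; the function name `kaplan_meier` and the toy durations are invented for illustration. Units that are still running when observation ends are treated as censored: they count as at risk up to that point but never as failures.

```python
def kaplan_meier(durations, failed):
    """Return [(time, survival)] at each distinct failure time.

    durations: observation time per unit (e.g. days in service)
    failed:    True if the unit failed at that time, False if censored
    """
    events = sorted(zip(durations, failed))
    at_risk = len(events)
    survival = 1.0
    curve = []
    i = 0
    while i < len(events):
        t = events[i][0]
        deaths = 0
        leaving = 0
        # Group all units whose observation ends at time t.
        while i < len(events) and events[i][0] == t:
            deaths += events[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Failed and censored units both leave the risk set after time t.
        at_risk -= leaving
    return curve

# Invented toy data: an older model observed for a long time...
old = kaplan_meier([100, 200, 300, 400, 500], [True, False, True, True, False])
# ...and a newer model where every unit is still running at day 30.
new = kaplan_meier([30, 30, 30, 30, 30], [False] * 5)
print(old)
print(new)
```

In this toy case the newer model's curve never steps down, but only because nothing has been observed past day 30. That is the situation the ST8000NM0055 is in: a flat curve over a short window says little about where it will be after several years.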
Note that the format of the KM statistics below is documented with the R survival library (see survdiff), so please look there for details. The model assumes similar observation periods for all units, whereas the newest drives (ST8000NM0055) are very recent; the paucity of information on them is not really obvious from the plots because the start of the time axis is so compressed. For whatever it's worth, here are the KM stats by model:
survdiff(formula = sm ~ model, data = dm, rho = 0)

                                   N Observed Expected (O-E)^2/E (O-E)^2/V
model=HGST HMS5C4040ALE640      8642      107   580.25    385.98    441.79
model=HGST HMS5C4040BLE640     15464       92   417.12    253.41    279.56
model=Hitachi HDS5C3030ALA630   4664      144   491.60    245.78    284.93
model=Hitachi HDS5C4040ALE630   2719       78   284.46    149.85    162.69
model=Hitachi HDS722020ALA330   4774      218   447.08    117.38    130.37
model=Hitachi HDS723030ALA640   1048       71   108.93     13.21     13.62
model=ST3000DM001               4707     1705   217.78  10156.42  10723.61
model=ST31500341AS               787      216    31.94   1060.61   1068.38
model=ST31500541AS              2188      395   141.25    455.84    469.85
model=ST4000DM000              36611     1901  2129.54     24.53     41.82
model=ST500LM012 HN              806       31    38.19      1.35      1.36
model=ST6000DX000               1937       45   126.69     52.68     54.33
model=ST8000DM002               9936       71   126.93     24.65     27.02
model=ST8000NM0055              2460        2     4.14      1.11      1.13
model=WDC WD10EADS               550       60    44.89      5.09      5.13
model=WDC WD30EFRX              1321      154    99.21     30.26     30.86

 Chisq= 13239 on 15 degrees of freedom, p= 0
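For readers unsure how to read the (O-E)^2/E column, the following Python sketch recomputes it for three rows using the Observed and Expected values copied from the table above. The helper name `o_minus_e_sq_over_e` is made up for this example. Because the difference is squared, the column does not show direction, so the sketch also reports whether a model had more or fewer failures than expected under the hypothesis that all models share one survival curve.

```python
# (model, observed failures, expected failures) copied from the table above.
rows = [
    ("ST3000DM001", 1705, 217.78),
    ("ST4000DM000", 1901, 2129.54),
    ("ST8000NM0055", 2, 4.14),
]

def o_minus_e_sq_over_e(observed, expected):
    """Squared difference between observed and expected, scaled by expected."""
    return (observed - expected) ** 2 / expected

results = {}
for model, observed, expected in rows:
    value = o_minus_e_sq_over_e(observed, expected)
    direction = "more" if observed > expected else "fewer"
    results[model] = (value, direction)
    print(f"{model}: {value:.2f} ({direction} failures than expected)")
```

The recomputed values should agree with the printed column up to rounding of the Expected figures. The ST3000DM001 dominates the overall chi-square because it failed far more often than expected, while the tiny value for the ST8000NM0055 reflects how few failures have had time to accumulate rather than strong evidence either way.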
Here are the KM stats by manufacturer:
survdiff(formula = s ~ manufact, data = ds, rho = 0)

                      N Observed Expected (O-E)^2/E (O-E)^2/V
manufact=HGST     24288      205   1075.1   704.174   878.487
manufact=Hitachi  13246      515   1449.0   602.059   886.691
manufact=HN         806       31     40.2     2.122     2.139
manufact=ST       60093     4687   3030.1   906.021  1958.192
manufact=TOSHIBA    545       18     21.1     0.452     0.454
manufact=WDC       4004      431    271.5    93.746    98.367

 Chisq= 2413 on 5 degrees of freedom, p= 0
Oh, and for those wanting SVG images: neither they nor PDFs can be uploaded here, so I'll put them in the GitHub repository at https://github.com/fubar2/backblazeKM just as soon as I find time to update it.