Update to Q1 2017: Seagate redeemed?


Update June 8 2017



After some delay, I finally got around to downloading another 9 months of data and rerunning the KM plots. Methods are documented in the first post http://bioinformare.blogspot.com/2016/02/survival-analysis-of-hard-disk-drive.html and won't be repeated here. Note that drive models with fewer than 500 units, and manufacturers with fewer than 200 units, are left out to keep the plots readable. You can change these thresholds in the code if you need to.
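For anyone wanting to experiment with those cut-offs outside the repository code, here is a minimal Python sketch of the same idea. It is not the code used for the plots: the record layout (one dict per drive with "model" and "manufact" keys), the function name keep_large_groups and the toy data are all assumptions made for illustration. Only the two thresholds come from this post.

```python
from collections import Counter

MIN_MODEL_UNITS = 500         # threshold stated in the post
MIN_MANUFACTURER_UNITS = 200  # threshold stated in the post


def keep_large_groups(drives, key, min_units):
    """Keep only drives whose group (by `key`) has at least `min_units` members."""
    counts = Counter(d[key] for d in drives)
    return [d for d in drives if counts[d[key]] >= min_units]


# Hypothetical toy data: one dict per drive.
drives = (
    [{"model": "A", "manufact": "X"}] * 600
    + [{"model": "B", "manufact": "X"}] * 50
    + [{"model": "C", "manufact": "Y"}] * 150
)

# The two filters are applied independently, one per plot.
by_model = keep_large_groups(drives, "model", MIN_MODEL_UNITS)
by_manufact = keep_large_groups(drives, "manufact", MIN_MANUFACTURER_UNITS)
print(len(by_model), len(by_manufact))
```

Note that a small model can still count toward its manufacturer's total, because the two filters are independent in this sketch.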

The images below are really too detailed to display properly here, so they are also available at https://github.com/fubar2/backblazeKM for the closer inspection they deserve. Sorry for the ugly layout; you can download them or clone the repository if you want a closer look.

Cutting straight to the chase, here's the drive model survival curve to date:



The newer Seagate ST8000NM0055 promises excellent longevity, although it has only been observed for a very short time, so its initial curve may change as more data arrive.

Also, I haven't tested whether this is related to the SMART statistics, because I haven't looked at them since finding a lot of missing data when I first examined them.


This of course leads to some improvement in Seagate's manufacturer KM curve, but that curve is weighed down heavily by some early drives with long observation periods and high failure rates over time; the curves for those drives drop off rapidly. I'm not sure how useful the manufacturer curves are, because poorly performing older models can mask a newer model that looks respectably long lived.



Note that the KM statistics below are documented in the R survival package (which provides survdiff), so please look there for details. The model assumes similar observation periods for all units, whereas the newest drives (ST8000NM0055) are very recent, and the paucity of information is not really obvious from the plots because the start of the time axis is so compressed. For whatever it's worth, here are the KM stats by model:
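To make the short-observation caveat concrete, here is a minimal pure-Python sketch of the Kaplan-Meier product-limit estimate. It is not the R code behind the plots, and the durations and failure flags are invented toy values, not Backblaze data. The point is only that a model observed briefly with no failures yet has a flat curve at 1.0 that says nothing beyond its last observed day.

```python
def kaplan_meier(durations, failed):
    """Return [(time, survival)] at each distinct failure time.

    durations: days each unit was observed
    failed:    1 if the unit failed at that time, 0 if it was censored
    """
    data = sorted(zip(durations, failed))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather every unit whose observation ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            surv *= 1 - deaths / n_at_risk
            curve.append((t, surv))
        n_at_risk -= removed
    return curve


# Invented toy data (days observed, failure flag).
old_curve = kaplan_meier([100, 400, 700, 900, 900], [1, 1, 1, 0, 0])
new_curve = kaplan_meier([30, 60, 60, 60, 60], [0, 0, 0, 0, 0])
print(old_curve)  # steps down at days 100, 400 and 700
print(new_curve)  # no failures yet, so no steps, but only 60 days of evidence
```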

survdiff(formula = sm ~ model, data = dm, rho = 0)

                                  N Observed Expected (O-E)^2/E (O-E)^2/V

model=HGST HMS5C4040ALE640     8642      107   580.25    385.98    441.79
model=HGST HMS5C4040BLE640    15464       92   417.12    253.41    279.56
model=Hitachi HDS5C3030ALA630  4664      144   491.60    245.78    284.93
model=Hitachi HDS5C4040ALE630  2719       78   284.46    149.85    162.69
model=Hitachi HDS722020ALA330  4774      218   447.08    117.38    130.37
model=Hitachi HDS723030ALA640  1048       71   108.93     13.21     13.62
model=ST3000DM001              4707     1705   217.78  10156.42  10723.61
model=ST31500341AS              787      216    31.94   1060.61   1068.38
model=ST31500541AS             2188      395   141.25    455.84    469.85
model=ST4000DM000             36611     1901  2129.54     24.53     41.82
model=ST500LM012 HN             806       31    38.19      1.35      1.36
model=ST6000DX000              1937       45   126.69     52.68     54.33
model=ST8000DM002              9936       71   126.93     24.65     27.02
model=ST8000NM0055             2460        2     4.14      1.11      1.13
model=WDC WD10EADS              550       60    44.89      5.09      5.13
model=WDC WD30EFRX             1321      154    99.21     30.26     30.86

 Chisq= 13239  on 15 degrees of freedom, p= 0 
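As a reading aid for the table above, the (O-E)^2/E column can be recomputed from the Observed and Expected columns. The sketch below does that in Python for three rows copied from the table; the function name contribution is my own. Small differences from the printed values are expected, because the Expected column is itself rounded in the output.

```python
def contribution(observed, expected):
    """Per-group term (O - E)^2 / E, as shown in the survdiff table."""
    return (observed - expected) ** 2 / expected


# (Observed, Expected) pairs copied from the model table above.
rows = {
    "HGST HMS5C4040ALE640": (107, 580.25),
    "ST3000DM001": (1705, 217.78),
    "ST8000NM0055": (2, 4.14),
}

for model, (obs, exp) in rows.items():
    direction = "more" if obs > exp else "fewer"
    print(f"{model}: {contribution(obs, exp):.2f} ({direction} failures than expected)")
```

A group with far more failures than expected (ST3000DM001) and one with far fewer (HGST HMS5C4040ALE640) both produce large terms, so the column measures how far a model departs from expectation, not which direction it departs in. Compare Observed with Expected to get the direction.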



Here are the KM stats by manufacturer:

survdiff(formula = s ~ manufact, data = ds, rho = 0)

                     N Observed Expected (O-E)^2/E (O-E)^2/V

manufact=HGST    24288      205   1075.1   704.174   878.487
manufact=Hitachi 13246      515   1449.0   602.059   886.691
manufact=HN        806       31     40.2     2.122     2.139
manufact=ST      60093     4687   3030.1   906.021  1958.192
manufact=TOSHIBA   545       18     21.1     0.452     0.454
manufact=WDC      4004      431    271.5    93.746    98.367


 Chisq= 2413  on 5 degrees of freedom, p= 0 
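The earlier remark that Seagate's manufacturer curve is weighed down by older models can be checked with simple arithmetic on the two tables. The Python sketch below uses only figures copied from the manufact=ST and model=ST3000DM001 rows. The crude failure fractions it prints ignore observation time entirely, so treat them as a rough illustration and not as a substitute for the KM curves.

```python
# Figures copied from the two survdiff tables in this post.
ST_UNITS, ST_FAILURES = 60093, 4687        # manufact=ST row
DM001_UNITS, DM001_FAILURES = 4707, 1705   # model=ST3000DM001 row

share_of_units = DM001_UNITS / ST_UNITS
share_of_failures = DM001_FAILURES / ST_FAILURES

# Crude failed fraction, ignoring how long each drive was observed.
crude_all = ST_FAILURES / ST_UNITS
crude_without = (ST_FAILURES - DM001_FAILURES) / (ST_UNITS - DM001_UNITS)

print(f"ST3000DM001 share of Seagate units:    {share_of_units:.1%}")
print(f"ST3000DM001 share of Seagate failures: {share_of_failures:.1%}")
print(f"Crude failed fraction, all Seagate:    {crude_all:.1%}")
print(f"Crude failed fraction, without it:     {crude_without:.1%}")
```

One model with under a tenth of the units accounts for over a third of the failures, which is why the manufacturer-level curve says little about the newer drives.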




Oh, and for those wanting SVG images: they and the PDFs cannot be uploaded here, so I'll put them in the GitHub repository at https://github.com/fubar2/backblazeKM just as soon as I find time to update it.
