Bug#826907: mdadm: please configure either component device timeout or scterc to guard against scsi layer timeouts

Michael Tokarev mjt at tls.msk.ru
Fri Jun 10 07:43:40 UTC 2016


Control: severity -1 wishlist

10.06.2016 04:31, Henrique de Moraes Holschuh wrote:
...
> One must enable SCTERC (e.g. with smartctl -l scterc,70,500) before
> starting the array (initramfs included).  Fortunately, support for SCTERC
> can be detected, and it can be queried, so one would only mess with it
> when unset.  A longer write timeout might help ensure the HDD has time
> to relocate the sector (there's never a good reason for an HDD with
> spare sectors still available to return a write error other than a
> SCTERC write timeout, or the spare tracks going bad/full).
> 
> Alternatively, mdadm could increase the timeout of sat/ata component
> devices in the scsi layer, from 30s to something like 120s through
> /sys/block/###/device/timeout.  This avoids worse data-loss in many
> cases, but md will hang for far longer when the component device really
> has gone to lalalala land...

I don't think it is mdadm's job to tweak devices like this.
Devices can be programmed independently, and with different timeouts
for different functions.

Further, it is not a strong requirement to do so in the initramfs; this
programming can be done after the system has finally booted.  Sure, if
there's a drive error during bootup, it will cause these bad
effects, but the chance of hitting an error there is smaller.
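
As a rough illustration of doing this programming after boot, outside of
mdadm, here is a minimal Python sketch.  It is not part of mdadm or of any
Debian package.  The function names are hypothetical, the 70/500 and 120
values are simply the ones quoted above, and the substrings matched in the
smartctl output are an assumption about how "smartctl -l scterc" reports
its state, so check them against your smartmontools version.

```python
import subprocess
from pathlib import Path


def scterc_state(smartctl_output):
    """Classify `smartctl -l scterc` output.

    Returns 'unsupported', 'disabled' or 'enabled'. The matched substrings
    are assumptions about smartctl's wording, not a documented interface.
    """
    text = smartctl_output.lower()
    if "sct error recovery control" not in text or "not supported" in text:
        return "unsupported"
    if "disabled" in text:
        return "disabled"
    return "enabled"


def plan_action(dev, state, read_ds=70, write_ds=500, fallback_timeout_s=120):
    """Decide what to do for one disk (e.g. dev='sda'); touches nothing.

    - SCTERC supported but unset: set it (values are in deciseconds).
    - SCTERC unsupported: raise the SCSI-layer command timeout instead.
    - SCTERC already set: leave the drive alone.
    """
    if state == "disabled":
        return ("run", ["smartctl", "-l", f"scterc,{read_ds},{write_ds}", f"/dev/{dev}"])
    if state == "unsupported":
        return ("write", f"/sys/block/{dev}/device/timeout", str(fallback_timeout_s))
    return None


def probe(dev):
    """Query the drive's SCTERC state; needs root and smartmontools."""
    result = subprocess.run(
        ["smartctl", "-l", "scterc", f"/dev/{dev}"],
        capture_output=True, text=True,
    )
    return scterc_state(result.stdout)


def apply_action(action):
    """Carry out an action returned by plan_action(); needs root."""
    if action is None:
        return
    if action[0] == "run":
        subprocess.run(action[1], check=True)
    else:
        Path(action[1]).write_text(action[2])
```

For a single disk one would call apply_action(plan_action("sda", probe("sda")))
as root, for example from a boot-time script that runs once the system is up.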

One more thing.  Most consumer drives these days don't support SCTERC.
I have an almost anecdotal situation here on our own machine.  It had four
1TB WD Black drives bought some years ago.  One drive failed, and we
bought a replacement of the same model.  It reports the SAME FIRMWARE
VERSION as the older drives.  But it doesn't support SCTERC anymore,
while the older drives do.  Go figure.

On the other hand, much "enterprise" gear (machines) doesn't support
JBOD mode at all, enabling only hardware RAID usage.  I see fewer and
fewer machines supporting JBOD mode without additional licenses.

Having said that, I see less and less usefulness of supporting SCTERC
in the first place.  For consumers it is less useful because modern
drives don't support it, and for enterprise because the machines
don't support JBOD.  Oh well.

Hence changing severity to wishlist.

Thanks,

/mjt


