Bug#780207: default timeouts causing data loss

Chris email.bug at arcor.de
Thu Mar 19 14:36:20 UTC 2015


Control: Severity 780207 serious
Control: Severity 780162 serious

I've thought about the severity some more and concluded that I'll
attempt setting the severity back to serious:

The affected user base is very large (anyone with regular non-RAID drives).
An occasional read or write error can happen at any time (as in "imminent").
More severely: errors that would normally be recoverable by the drive's
firmware (given proper timeouts) are nothing unusual with magnetic disks.
But for the affected user base, the drive's recovery process will
be interrupted by a complete controller reset (risking data).
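
To make the timeout relationship concrete, here is a minimal sketch in
Python. It assumes the kernel's per-device command timeout can be read from
/sys/block/<dev>/device/timeout (in seconds) on Linux. The function names
and the drive recovery time are purely illustrative assumptions, not
something taken from this bug report or from mdadm.

```python
from pathlib import Path


def read_kernel_timeout(device: str, sysfs_root: str = "/sys/block") -> int:
    """Return the kernel command timeout in seconds for a device such as 'sda'.

    Assumes the Linux sysfs layout <sysfs_root>/<device>/device/timeout.
    """
    path = Path(sysfs_root, device, "device", "timeout")
    return int(path.read_text().strip())


def recovery_may_be_interrupted(kernel_timeout_s: float,
                                drive_recovery_s: float) -> bool:
    """True if the drive's internal error recovery can outlast the kernel
    timeout, in which case the kernel may reset the link or controller
    before the firmware has finished recovering.

    drive_recovery_s is a hypothetical worst-case figure supplied by the
    caller; it is not queried from the drive here.
    """
    return drive_recovery_s >= kernel_timeout_s
```

For example, recovery_may_be_interrupted(read_kernel_timeout("sda"), 120)
would flag a drive assumed to need up to 120 seconds for recovery whenever
the kernel timeout is 120 seconds or less.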

As long as one has not been hit by this bug it may not seem too severe,
but that changes as soon as some common intermittent disk read/write
error (possibly even unavoidable over time), which would have been perfectly
recoverable by the firmware with correct default timeouts, has
silently caused data loss, redundancy loss, or corruption.


