Bug#396582: Some additional info

Dan Pascu dan at ag-projects.com
Sat Nov 4 11:31:47 CET 2006


martin f krafft wrote:
> also sprach Dan Pascu <dan at ag-projects.com> [2006.11.03.2238 +0100]:
>   
>> But I'm glad you were able to at least see the problem I'm
>> experiencing. One thing that intrigues me is why in my case when
>> failing a drive and stopping the array, after restarting it, the
>> failed drive was already removed (even though I never removed it
>> myself) and the arrays started degraded with 1 drive out of 2, and
>> in your case the array started with the failed drive included and
>> reported that it started with 2 drives.
>>     
>
> This only happens when the last update time stored in the failed
> component's superblock is the same as the time in the other
> components' superblocks. Then mdadm says that the drive is failed but
> looks okay. When I reproduced the problem, I saw exactly your
> behaviour and did not have to remove the component.
>
> Here's what I think happens exactly:
>
>   - while the array is running, writes result in updates to the
>     superblocks
>   - if you fail a component, its superblock is no longer updated
>   - when you stop/start an array, mdadm checks all the superblocks
>   - if they all seem as if they'd been stopped at the same time, it
>     just assembles.
>   - however, if a superblock seems out of date, md writes:
>
>       kernel: md: kicking non-fresh sde1 from array!
>
>     and starts the array in degraded mode.
>   

I think that explains it. Since I was using bitmaps, which were updated 
every 5 seconds, it's very likely that the remaining drive's superblock 
was updated at least once between the moment I failed the other drive 
and the moment I stopped the array, so the failed drive's superblock was 
no longer fresh when the array was reassembled.
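
One way to check this would be to compare the superblock timestamps and 
event counters on the two components before restarting the array; the 
failed drive should show an older update time and a lower event count. 
Something along these lines (device names are only placeholders, not my 
actual setup):

  # superblock of the component that stayed active in the array
  mdadm --examine /dev/sda1 | grep -E 'Update Time|Events'
  # superblock of the failed component; expect an older Update Time
  # and a lower Events count here
  mdadm --examine /dev/sdb1 | grep -E 'Update Time|Events'
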
I guess you can obtain the same effect without writing to the array 
while it is degraded, by manually removing the drive after failing it 
and before stopping the array. The issue seems to appear whenever the 
array starts degraded with one of its two drives missing.
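
Roughly, the sequence to reproduce that without any writes would be 
something like this, assuming /dev/md0 is a two-disk RAID1 on /dev/sda1 
and /dev/sdb1 (again, placeholder names, not my actual devices):

  # fail one component, then remove it by hand before stopping
  mdadm /dev/md0 --fail /dev/sdb1
  mdadm /dev/md0 --remove /dev/sdb1
  # stop and reassemble; the array should come back degraded,
  # running on only 1 of its 2 drives
  mdadm --stop /dev/md0
  mdadm --assemble /dev/md0 /dev/sda1 /dev/sdb1
  cat /proc/mdstat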

-- 
Dan