Bug#611537: Bug#643507: not really solved

Diego Guella diego.guella at sircomtech.com
Fri Feb 3 08:09:39 UTC 2012


From: "Vladimir 'φ-coder/phcoder' Serbinenko" <phcoder at gmail.com>
> On 18.11.2011 16:47, Diego Guella wrote:
>> Actually, I discovered that the bug is still there for me too, although
>> it has another shape now.
>>
>> I now have a RAID-1 with 4 members; I use 5 HDDs and rotate them daily.
>> During the Debian installation, I created a 2-member RAID-1, and later I 
>> grew the array to 4 members.
>>
> If I understand this correctly your RAID never has all the devices 
> connected. This leads to big desync (even writing once to an incomplete 
> RAID causes desync).
> This is not a proper way to handle an array. Frankly, I'm surprised anything
> works at all under such abuse.

I've used my 5-device RAID1 for over 2 years with lenny, and never had those
problems.
Maybe I'm abusing RAID, but I don't think so.

I'd like to better understand what you mean by "desync".

This is what I do (and have done for 2 years with lenny):
4-member RAID1: (a),(b),(c),(d), plus one HDD (e) disconnected from the 
system.
1. The system is on, 4 HDDs present, RAID1 OK.
2. Turn the system off.
3. Remove HDD (d) from the system (it was /dev/sdd).
4. Attach HDD (e) to the system (it will become the new /dev/sdd).
5. Power on the system.

At this point, the RAID1 is in this state:
-----
root at devilserver:~# cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 sdb3[2] sdc3[3] sda3[6]
      1903708024 blocks super 1.2 [4/3] [U_UU]

md0 : active raid1 sdb2[2] sdc2[3] sda2[6]
      48827320 blocks super 1.2 [4/3] [U_UU]

unused devices: <none>
-----
(md1 is mounted on /home, md0 is mounted on /)
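
For reference, in the status lines above "[4/3]" means four member slots are
configured and three are active, and "[U_UU]" marks the second slot as
missing. A minimal sketch of a check for this state follows, assuming the
mdstat layout shown above; the function name degraded_arrays is hypothetical
and not part of mdadm.

```shell
#!/bin/sh
# degraded_arrays: hypothetical helper, not part of mdadm.
# Reads /proc/mdstat-style text on stdin and prints one line per array
# whose active-member count is lower than its configured count.
degraded_arrays() {
    awk '
        # An array header looks like: "md0 : active raid1 sdb2[2] ..."
        $2 == ":" && $1 ~ /^md/ { name = $1; next }
        # The status line carries "[configured/active]", e.g. "[4/3]"
        name != "" && match($0, /\[[0-9]+\/[0-9]+\]/) {
            split(substr($0, RSTART + 1, RLENGTH - 2), n, "/")
            if (n[2] + 0 < n[1] + 0)
                printf "%s %d/%d\n", name, n[2], n[1]
            name = ""
        }
    '
}

# Usage on a live system (assumption: /proc/mdstat is readable):
#   degraded_arrays < /proc/mdstat
```

Fed the output above, this would report both md1 and md0 as running with 3
of 4 members.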

6. Add HDD (e) to the RAID1:
-----
root at devilserver:~# mdadm /dev/md0 -a /dev/sdd2
mdadm: re-added /dev/sdd2
root at devilserver:~# mdadm /dev/md1 -a /dev/sdd3
mdadm: re-added /dev/sdd3
-----
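
The two commands above could be generated rather than typed each day. This is
only a sketch, assuming the layout from this session (md0 on partition 2, md1
on partition 3); the helper name readd_commands is hypothetical, and it only
prints the mdadm commands instead of running them.

```shell
#!/bin/sh
# readd_commands: hypothetical helper that only PRINTS the mdadm commands
# for a swapped-in disk; it does not run them.
# Arguments: the disk (e.g. /dev/sdd), then array:partition pairs.
readd_commands() {
    disk=$1
    shift
    for pair in "$@"; do
        array=${pair%%:*}      # text before the colon, e.g. md0
        part=${pair##*:}       # text after the colon, e.g. 2
        echo "mdadm /dev/$array -a $disk$part"
    done
}

# Layout assumed from the session above: md0 uses partition 2, md1 partition 3.
readd_commands /dev/sdd md0:2 md1:3
```

Printing first makes it possible to check the device names before anything
touches the arrays; the printed lines can then be run as root.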

At this point, the RAID1 is in this state:
-----
root at devilserver:~# cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 sdd3[5] sdb3[2] sdc3[3] sda3[6]
      1903708024 blocks super 1.2 [4/3] [U_UU]
        resync=DELAYED

md0 : active raid1 sdd2[5] sdb2[2] sdc2[3] sda2[6]
      48827320 blocks super 1.2 [4/3] [U_UU]
      [>....................]  recovery =  1.1% (539648/48827320) finish=28.3min speed=28402K/sec

unused devices: <none>
-----
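
In the output above, md0 is rebuilding ("recovery =") while md1 waits its turn
("resync=DELAYED"). A minimal sketch for testing whether any rebuild is still
running or queued, assuming the mdstat wording shown above; the function name
rebuild_pending is hypothetical.

```shell
#!/bin/sh
# rebuild_pending: hypothetical helper, not part of mdadm.
# Reads /proc/mdstat-style text on stdin; returns 0 if any array is still
# rebuilding or waiting to rebuild, non-zero otherwise.
rebuild_pending() {
    # Matches "recovery =", "resync =" and "resync=DELAYED" style markers.
    grep -E '(recovery|resync) *=' >/dev/null
}

# Usage on a live system (assumption: /proc/mdstat is readable):
#   while rebuild_pending < /proc/mdstat; do sleep 60; done
```

The mdadm man page also describes a --wait option that blocks until resync or
recovery activity on the given device has finished, which may be simpler where
it is available.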

When the resync completes, the RAID1 will be OK again.
The disconnected HDD (d) is an emergency copy of the system.
I can recover files from it, or even connect it to an identical system and 
get a working machine in 0 minutes in case of a disaster.

Now I'd like to understand:
-What's wrong with what I'm doing?
Suppose that drive (d) really died while the system was turned off.
What am I supposed to do in that situation? Pick a new drive (e), connect it
to the system, and add it to the RAID1 array.
Isn't that the same?

I can grow the array to 5 members; the only downside of that is the annoying
mail message from mdadm about a "DegradedArray event" at every boot of
the machine.
I am open to other suggestions, too.

Diego 





