Bug#728677: mdadm: has no superblock - assembly aborted (but it has)

Michael Tokarev mjt at tls.msk.ru
Fri Dec 5 10:23:19 UTC 2014


Control: tag -1 + moreinfo

[Replying to a relatively old bugreport]

04.11.2013 06:27, Marc Lehmann wrote:
> Package: mdadm
> Version: 3.3-2
> Severity: normal
> 
> Dear Maintainer,
> *** Please consider answering these questions, where appropriate ***
> 
> I attached 4 disks to my computer (already booted) and used mdadm to
> create a new raid5 volume, created a partition, formatted it, wrote data
> on it.
> 
>    mdadm /dev/md/doom --create -e 1.0 -z 3900390625 -v -n4 -c 256 -l 5 -p ddf-zero-restart --assume-clean \
>       ata-HGST_HDS724040ALE640_1 \
>       ata-HGST_HDS724040ALE640_2 \
>       ata-HGST_HDS724040ALE640_3 \
>       ata-HGST_HDS724040ALE640_4
> 
> Later, I detached it, and on the next day, reattached it again, used mdadm to assemble the array, everything worked fine:
> 
>    mdadm -A /dev/md/doom \
>       ata-HGST_HDS724040ALE640_1 \
>       ata-HGST_HDS724040ALE640_2 \
>       ata-HGST_HDS724040ALE640_3 \
>       ata-HGST_HDS724040ALE640_4
> 
> Then I rebooted. After the reboot, I again tried to assemble the array, to no avail:
> 
>    mdadm: no RAID superblock on ata-HGST_HDS724040ALE640_2
>    mdadm: ata-HGST_HDS724040ALE640_2 has no superblock - assembly aborted
> 
> Now, when I try to recreate the array (first mdadm command), I get:
> 
>    mdadm: ata-HGST_HDS724040ALE640_1 appears to be part of a raid array:
>           level=raid5 devices=4 ctime=Thu Oct 31 15:13:13 2013
>    mdadm: ata-HGST_HDS724040ALE640_2 appears to be part of a raid array:
>           level=raid0 devices=0 ctime=Thu Jan  1 01:00:00 1970
>    mdadm: partition table exists on ata-HGST_HDS724040ALE640_2 but will be lost or
>           meaningless after creating array
>    mdadm: ata-HGST_HDS724040ALE640_3 appears to be part of a raid array:
>           level=raid5 devices=4 ctime=Thu Oct 31 15:13:13 2013
>    mdadm: ata-HGST_HDS724040ALE640_4 appears to be part of a raid array:
>           level=raid5 devices=4 ctime=Thu Oct 31 15:13:13 2013
>    mdadm: automatically enabling write-intent bitmap on large array
>    Continue creating array?
> 
> At that point I looked at /proc/mdstat, and saw that the array was already
> assembled - without the disk #2.
> 
>    Personalities : [raid6] [raid5] [raid4] 
>    md127 : active (auto-read-only) raid5 sdb[0] sdf[3] sde[2]
>          11701171200 blocks super 1.0 level 5, 256k chunk, algorithm 1 [4/3] [U_UU]
>                
>          unused devices: <none>
> 
> There are two pairs of disks, and each is on a different on-board
> controller, so I cannot explain why, apparently, one disk was kicked out
> on reboot (there are no I/O errors in the kernel log).
> 
> In any case, the message that mdadm prints (... has no superblock) is
> simply wrong.
> 
> Formatting the drive with --assume-clean made the disks reappear - I know
> they must be clean because the device was unmounted and not in use before
> and after the reboot, so nothing should have written to it.

I'm not really sure what to make of this bug report.  Tagging it
`moreinfo' for now.

I don't know what happened, or how the superblock temporarily disappeared
from the device ata-HGST_HDS724040ALE640_2.  But mdadm reports what it
finds.

The only way to verify what is going on is to see what mdadm actually
does when it is asked to assemble the array and complains about the
missing superblock.  For example, run it under strace to see which
system calls it makes and what the kernel returns for them.
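
A minimal sketch of such a traced run, assuming strace is installed and
the commands are run as root from a directory where the member names
resolve.  The function name trace_assemble and the log path are made up
for illustration; the options used (strace -f, -tt, -o and mdadm
--examine, --assemble, --verbose) are standard ones.

```shell
# Hypothetical helper: assemble an array under strace and keep the log.
# Usage: trace_assemble /dev/md/doom member1 member2 ...
trace_assemble() {
    array=$1; shift
    log=${TRACE_LOG:-/tmp/mdadm-assemble.strace}   # assumed log location

    # Record what mdadm itself reads from each member's superblock.
    for dev in "$@"; do
        mdadm --examine "$dev"
    done

    # -f follows child processes, -tt adds timestamps, -o writes the trace to a file.
    strace -f -tt -o "$log" mdadm --assemble --verbose "$array" "$@"
    status=$?

    # Failed open calls on a member (for example EBUSY) are the interesting part.
    grep -E 'open(at)?\(.*= -1' "$log" || true
    return "$status"
}
```

It would be called with the array name followed by the four member
devices; the --examine output and the trace file are what to attach to
the bug.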

What I do notice, however, is that ata-HGST_HDS724040ALE640_2 contains a
partition table, and mdadm correctly detects this.  So this is the first
data device in the array, and the actual array data starts there.  With
superblock format 1.0 the superblock sits at the end of the device, so
the array data begins at the very start of the drive, i.e., the same
partition table is visible both on this drive and on your array.  This is
a reliable way to cause problems: the system sees your ATA drive first,
tries to parse the partition table on it, and might even try to mount
some of the partitions, making the drive busy so that mdadm is unable to
open it for assembly.

When you create a raid5 array, use superblock version 1.1 or 1.2 (1.2 is
the default).  These keep the metadata at the start of the device, so
other tools will not mistake your component devices for standalone disks
and try to access them outside the array.
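
To illustrate where each 1.x format puts its superblock, here is a small
sketch.  The function name sb_offset_sectors is hypothetical, and the
arithmetic follows my reading of the md(4) documentation and mdadm's
version-1 superblock code, so treat it as an illustration rather than a
reference.  The sector count used in the example calls is a typical
figure for a 4 TB drive, not a value taken from the report.

```shell
# Hypothetical helper: sector (512-byte units) at which a v1.x superblock
# sits on a member of the given size.
# Usage: sb_offset_sectors VERSION SIZE_IN_SECTORS
sb_offset_sectors() {
    case $1 in
        1.0) echo $(( ($2 - 16) & ~7 )) ;;   # at least 8 KiB from the end, 4 KiB aligned
        1.1) echo 0 ;;                        # very start of the device
        1.2) echo 8 ;;                        # 4 KiB from the start
        *)   echo "unknown metadata version: $1" >&2; return 1 ;;
    esac
}

sb_offset_sectors 1.0 7814037168   # superblock near the end, data from sector 0
sb_offset_sectors 1.2 7814037168   # superblock near the start, data after it
```

With 1.0, sector 0 of a member holds array data, so a partition table
written inside the array can show up on the raw disk as well.  With 1.1
or 1.2, the start of the member is taken by (or reserved around) the
superblock, and the array data begins further in.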

Given this, I'd say it is operator error.

Thanks,

/mjt


