[Parted-maintainers] Bug#778712: libparted2: Breakage of RAID GPT header

jnqnfe jnqnfe at gmail.com
Fri Feb 20 20:44:29 UTC 2015


Control: severity -1 normal
Control: close -1
thanks

On Fri, 2015-02-20 at 15:12 -0500, Phillip Susi wrote:
> I'm sorry; I misread what you said.  I thought you said you had
> removed the information about the individual disks that were members
> of the array.

No problem.

> At this point the array contains a protective MBR that lists one
> partition of type ee that occupies the whole array.  Fdisk looks at
> sdb and sees the same thing.  Following the MBR is the GPT, part of
> which is missing from sdb, so fdisk treats it as corrupt, and falls
> back to printing only the MBR.

Yes, I'm with you.
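
For anyone following along, here is a rough sketch of the two checks
described above: is there a protective MBR (an entry of type ee) in
sector 0, and does sector 1 hold a GPT header whose CRC matches. The
function names are my own, not anything from fdisk or parted, and a real
tool also validates the partition entry array and the backup header at
the end of the device, which this does not attempt.

```python
import struct
import zlib

SECTOR = 512  # assumes 512-byte logical sectors


def has_protective_mbr(sector0: bytes) -> bool:
    """True if LBA 0 has the 0x55AA boot signature and a type-0xEE entry."""
    if len(sector0) < SECTOR or sector0[510:512] != b"\x55\xaa":
        return False
    # Four 16-byte partition entries start at offset 446; the type byte is at +4.
    return any(sector0[446 + 16 * i + 4] == 0xEE for i in range(4))


def gpt_header_ok(sector1: bytes) -> bool:
    """True if LBA 1 carries the GPT signature and a matching header CRC32."""
    if len(sector1) < 92 or sector1[0:8] != b"EFI PART":
        return False
    header_size = struct.unpack_from("<I", sector1, 12)[0]
    if not 92 <= header_size <= len(sector1):
        return False
    stored_crc = struct.unpack_from("<I", sector1, 16)[0]
    # The header CRC is computed with the CRC field itself set to zero.
    header = bytearray(sector1[:header_size])
    header[16:20] = b"\x00\x00\x00\x00"
    return (zlib.crc32(bytes(header)) & 0xFFFFFFFF) == stored_crc
```

A device where has_protective_mbr() is true but the GPT checks fail is
the case where a tool can only fall back to printing the MBR.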

> > So the phantom sdb1 device was not there when only fdisk was used 
> > (fdisk4), but does appear after using parted, whether using parted
> > to create the partition table (fdisk 2, fdisk3), or as in the last
> > test, only to view information (parted -l) after using fdisk
> > (fdisk5).
> 
> I see now.  I think you are running into a cache aliasing issue here.
>  That is to say, that the MBR of sdb was read into the cache while the
> drive was still blank, and when parted creates the gpt on the array,
> it does in fact create that protective mbr partition, but fdisk does
> not see it on sdb yet, since it is still holding the cached data from
> earlier.  Note that at this point fdisk reports that there is no
> partition table of any kind, not just no sdb1.  If you run blockdev
> --flushbufs and then repeat the fdisk -l, sdb1 should show up.

I agree now that this might just be an fdisk caching issue, but I don't
think this bit is quite as you describe. The actions taken and results
were as follows:
1) RAID array recreated.
2) fdisk used to create GPT table on md126.
3) fdisk -l, showing no issues and no info from MBR.
4) parted -l, pointing out corrupt GPT table.
5) fdisk -l, now showing info from the MBR and the error.

So, on the basis that fdisk writes the same protective MBR that parted
does, it seems fdisk is failing to flush its cache, and so does not see
the problem when asked to display info immediately after creating the
partition table. Then either parted triggered a cache flush (a shared
cache, I presume?), or else fdisk managed to flush the cache the second
time around.
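
A minimal sketch of how the flush-then-recheck step suggested earlier
could be scripted for repeat testing. The helper names and the dry-run
default are my own invention; the commands themselves are just blockdev
--flushbufs followed by fdisk -l, and actually running them needs root.

```python
import subprocess


def recheck_commands(device: str) -> list[list[str]]:
    """Commands that flush cached buffers for `device`, then list its table."""
    return [
        ["blockdev", "--flushbufs", device],
        ["fdisk", "-l", device],
    ]


def recheck(device: str, dry_run: bool = True) -> list[str]:
    """Return the command lines; execute them only when dry_run is False."""
    lines = []
    for cmd in recheck_commands(device):
        lines.append(" ".join(cmd))
        if not dry_run:
            # Requires root; raises CalledProcessError if a command fails.
            subprocess.run(cmd, check=True)
    return lines
```

Calling recheck("/dev/sdb") only returns the two command lines, so it is
safe to try before running it for real with dry_run=False.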

So in conclusion, this whole confusing mess resulted from a combination
of:
1) parted being incapable of understanding RAID array membership.
2) fdisk also being incapable of understanding RAID array membership.
3) fdisk failing to flush a cache of partition info.

I'll reduce the severity of this bug report and close it now then.

Thank you for helping get to the bottom of this.

I will try to do a little further testing tomorrow to nail down more
precise details of the caching behaviour, and then report that against
fdisk, along with a request that fdisk also gain an understanding of
RAID array membership.

Thanks again :)


