[Parted-maintainers] Bug#778712: libparted2: Breakage of RAID GPT header

jnqnfe jnqnfe at gmail.com
Fri Feb 20 17:17:19 UTC 2015


On Fri, 2015-02-20 at 10:16 -0500, Phillip Susi wrote:
> On 2/19/2015 2:24 PM, jnqnfe wrote:
> > Firstly, I am not running fdisk or parted on the raw member disks,
> > I am simply running generic 'fdisk -l' and 'parted -l' commands,
> > which return information about all disks. To simplify matters I
> > removed information about other disks in my system from the output
> > I supplied, leaving only that pertaining to the array and array
> > member disks.
> 
> You did not; the output you supplied listed both sda and sdb.

What? I very carefully went through every one of them before sending to
ensure that only information about the array (md126) and the array
members (sdb and sdc) were included. I have just checked back over every
one of those files attached to the original bug report and none of them
contain any info about sda.

Do please note that in some of them sdc has been output before sdb, so
perhaps you didn't look carefully enough and misread sdc for sda in
these cases? I really don't know otherwise why on earth you think I've
sent info about sda.

> The GPT is 16 KiB but starts on sector 2, hence the last 2 sectors
> fall onto the second disk.

Okay, I'll take your word on that and thus that explains sufficiently
why parted thinks there's corruption.
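
To make that arithmetic concrete, here is a rough sketch. It assumes
512-byte sectors, a simple two-disk stripe with a 16 KiB chunk, and
array data starting at sector 0 of each member; none of those values
are stated in this thread, so treat them as placeholders. The 128
entries of 128 bytes are the usual GPT default.

```python
# Sketch only: which array sectors of the GPT entry array land on which member.
SECTOR_SIZE = 512               # bytes per logical sector (assumed)
CHUNK_SIZE = 16 * 1024          # stripe chunk size in bytes (assumed, not from the thread)
N_DISKS = 2                     # the two array members, sdb and sdc

GPT_ENTRIES_START_LBA = 2       # LBA 0 is the protective MBR, LBA 1 the GPT header
GPT_ENTRIES_BYTES = 128 * 128   # default: 128 entries of 128 bytes = 16 KiB


def member_for_lba(lba, chunk_sectors, n_disks):
    """Index of the member disk holding an array LBA in a simple stripe."""
    return (lba // chunk_sectors) % n_disks


chunk_sectors = CHUNK_SIZE // SECTOR_SIZE
entry_sectors = GPT_ENTRIES_BYTES // SECTOR_SIZE
entry_lbas = range(GPT_ENTRIES_START_LBA, GPT_ENTRIES_START_LBA + entry_sectors)

# Sectors of the entry array that a single-disk view of the first member cannot see.
on_second_disk = [
    lba for lba in entry_lbas if member_for_lba(lba, chunk_sectors, N_DISKS) == 1
]
print(on_second_disk)  # [32, 33]
```

Under those assumptions the entry array spans array sectors 2 to 33, so
a tool reading only the first member sees sectors 2 to 31 and then
unrelated data, which would fail the GPT checksum.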

> Because parted does not know anything about raid.  I suppose it might
> be nice if it could detect it and ignore those drives, but doing so
> would require adding a dependency on udev or blkid.  I'll mull the
> idea over.

Okay. I do think that it would be a very good idea for parted to do
this.

We can put that stuff to one side then and focus on this phantom sdb1
device...

> > Furthermore, if you look at the fdisk output I supplied, you
> > should notice that when I created the partition table with fdisk,
> > everything was initially fine; no '/dev/sdb1' device exists (see
> > fdisk4). However after running 'parted -l' to see what parted makes
> > of the result of using fdisk, and then re-running 'fdisk -l' (I
> > just happened to do so to be certain everything was fine, and found
> > to my surprise it was not), you can see that now all of a sudden a
> > '/dev/sdb1' device exists.
> 
> sdb1 shows up in fdisk2.

Yes, but please review the initial bug report for when I created each of
the output files. I ran three tests using different tools to create the
GPT headers, first with gparted, then with parted, then with fdisk.
Before each test I deleted and recreated the RAID array to get a fresh
start (checking the fdisk and parted output afterwards confirmed that
this does reset things). Files fdisk1 and parted1 show the state of
things directly after recreating the RAID array, before any attempt to
write a partition table.

So, fdisk2 and parted2 show the state of things after using gparted to
write a GPT table to the array, and thus this phantom sdb1 device
exists, which fdisk doesn't like.

Starting afresh, I then did the same thing but using parted. You can see
the state of things afterwards in fdisk3 and parted3. Again, as you can
see in fdisk3, this phantom sdb1 device exists which fdisk doesn't like.
No difference from using gparted.

Finally I started things afresh once more and used fdisk to create the
GPT partition table. The state of things after this according to fdisk
(which I checked first) and which you can see in fdisk4 shows no sign of
this phantom sdb1 device. So everything seems fine at this point
according to fdisk. I then checked the state of things with parted,
which you can see in file parted4. Then I checked fdisk one more time,
and that phantom sdb1 device is back, as can be seen in fdisk5.

So the phantom sdb1 device was not there when only fdisk was used
(fdisk4), but does appear after using parted, whether using parted to
create the partition table (fdisk2, fdisk3), or as in the last test,
only to view information (parted -l) after using fdisk (fdisk5).

As I said in my last email, I am not outright claiming that parted is
definitely directly responsible for creating this phantom device, but it
is a pretty damning coincidence that it has so far only appeared after
running parted.

> The moment you created the GPT table on the raid array, it included
> the protective MBR partition, and that is what fdisk is reporting
> since the GPT is corrupt ( when viewed through the lens of the single
> disk ).  lsblk uses the blkid database which does recognize that the
> disks are array components and filters them out.

Okay, I am aware that a protective MBR may be written alongside the GPT
tables and that the protective MBR may contain a partition entry
covering the entire disk. So you're suggesting that this may be what
this phantom sdb1 device is? Interesting.

But what is the explanation for it not appearing in fdisk output after
using fdisk to create the GPT tables in test #3? And furthermore what is
the explanation for it then suddenly appearing after then running
'parted -l'? And if that is the case then that would imply that fdisk
also may not be properly paying attention to the fact that these are
array members.

If fdisk is setting the protective MBR partition to the size of a member
disk, rather than the size of the array, that would explain the fdisk
error not showing up after using fdisk to create the partition tables.
And if parted is doing the opposite, setting the array size here, that
would explain the presence of the error in fdisk output after using
parted to do it. If so, that raises the question: which is right? Should
fdisk be changed to use the array size (alongside paying proper
attention to array membership of course), or should parted be using the
disk size?
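
As a rough illustration of that size question, the sketch below builds
a synthetic protective MBR sized for a whole array and checks it
against a single member. The sector counts and helper names are
hypothetical; only the on-disk layout (first partition entry at byte
446, type 0xEE, starting at LBA 1, size of the device minus one sector
capped at 0xFFFFFFFF) is the standard one.

```python
import struct

# Hypothetical sizes, in 512-byte sectors, for illustration only.
ARRAY_SECTORS = 2_000_000
MEMBER_SECTORS = 1_000_000


def build_protective_mbr(total_sectors):
    """Build sector 0 with one 0xEE entry covering a device of total_sectors."""
    sector = bytearray(512)
    size = min(total_sectors - 1, 0xFFFFFFFF)
    # Entry layout: status, CHS start, type, CHS end, start LBA, sector count.
    struct.pack_into("<B3sB3sII", sector, 446,
                     0x00, b"\x00\x02\x00", 0xEE, b"\xff\xff\xff", 1, size)
    sector[510:512] = b"\x55\xaa"  # boot signature
    return bytes(sector)


def parse_protective_mbr(sector0):
    """Return (start_lba, sector_count) of the first entry if it is type 0xEE."""
    if len(sector0) < 512 or sector0[510:512] != b"\x55\xaa":
        return None
    if sector0[446 + 4] != 0xEE:
        return None
    return struct.unpack_from("<II", sector0, 446 + 8)


def fits_device(entry, device_sectors):
    """True if the protective partition ends within the given device."""
    start_lba, count = entry
    return start_lba + count <= device_sectors


entry = parse_protective_mbr(build_protective_mbr(ARRAY_SECTORS))
print(entry)
print("fits array: ", fits_device(entry, ARRAY_SECTORS))
print("fits member:", fits_device(entry, MEMBER_SECTORS))
```

If sector 0 of the array maps to sector 0 of the first member, a tool
reading that member directly would see an entry like this one, sized
for the array and therefore running past the end of the member disk.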

This explanation still leaves two things unaccounted for though:
1) After creating the GPT partition tables with fdisk, if it's looking
at the protective MBR because it thinks that the GPT table of this
individual disk is corrupt, why does it not list sdb1 in the output like
it does when it also reports the error?
2) Why is the size of this protective MBR partition seen as fine by fdisk
after using fdisk to create the GPT tables, and then suddenly not fine
after using 'parted -l'? Is 'parted -l' changing the size, or is there
some delayed reaction to updating the information fdisk is reading and
'parted -l' may not be changing anything at all?

I appreciate your patience towards getting to the bottom of this.


