Bug#478238: grub-probe: fails to find drive for /dev/sda10

Török Edwin edwintorok at gmail.com
Sun May 11 11:18:58 UTC 2008


[sending to grub-devel@ as requested]

Robert Millan wrote:
> On Sun, May 04, 2008 at 05:01:32PM +0300, Török Edwin wrote:
>   
>>>>    Device Boot      Start         End      Blocks   Id  System
>>>> /dev/sda1   *           1        1275    10241406    7  HPFS/NTFS
>>>> /dev/sda2            1276        2248     7815622+  a6  OpenBSD
>>>> /dev/sda3            2249        5289    24426832+   f  W95 Ext'd (LBA)
>>>> /dev/sda4            6080        7296     9775552+  bf  Solaris
>>>> /dev/sda5            2249        2371      987966   82  Linux swap / Solaris
>>>> /dev/sda6            2372        3587     9767488+  83  Linux
>>>> /dev/sda7            3588        3600      104391   83  Linux
>>>> /dev/sda8            3601        4863    10145016   8e  Linux LVM
>>>> /dev/sda9            4864        5228     2931831   a6  OpenBSD
>>>> /dev/sda10           5229        5289      489951   83  Linux
>>>>         
>> [...]
>> grub> ls (hd0,10)
>> error: unknown device
>> grub> ls (hd0,11)
>> error: unknown device
>> grub>
>>     
>
> I tried reproducing your setup, but I can't hit the same bug.  This starts to
> look really nasty.  Just spotted this:
>
>   /build/buildd/grub2-1.96+20080426/partmap/pc.c:141: partition 0: flag 0x80, type 0x7, start 0x3f, len 0x1388afc
>   [...]
>   /build/buildd/grub2-1.96+20080426/partmap/pc.c:141: partition 0: flag 0x0, type 0x82, start 0x2270f07, len 0x1e267c
>
> for which I can't find any explanation other than memory corruption.  Also,
> due to a missing fflush() call the output is somewhat scrambled, which makes
> it harder to track (I fixed this already in upstream).
>
> Could you:
>
>   - Apply the attached patch & run grub-probe again (this time output
>     will be a bit more readable)
>   

There was no patch attached, so I ran 'cvs diff -u -D2008-04-30' and
applied the resulting patch instead.
I found what the problem is, and it also explains why you couldn't
reproduce it.

/dev/sda9 is typed as OpenBSD (a6) but is not a valid OpenBSD partition,
so at partmap/pc.c:176 the iteration fails with the error
"invalid disk label magic 0x%x".
If I replace that return with a continue, it works.

The problem is that grub2 stops looking for more partitions as soon as
it encounters the invalid partition.
grub 0.97 was working perfectly, so I never noticed that the partition
had the wrong type!

Also, if I change the partition type to 83 (as it should be), an unpatched
grub-probe can find that /boot is on /dev/sda10:
# grub-probe -t device /boot
/dev/sda10

I think grub2 should handle errors more gracefully: possibly mark the
partition as invalid, and keep going.
grub-probe was looking for /dev/sda10, and it shouldn't be affected by
/dev/sda9 being corrupted or invalid.
Think of it this way: if a partition gets corrupted, that shouldn't
prevent the system from booting, as long as the boot and root partitions
are still OK.
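
To illustrate the return-versus-continue difference, here is a minimal
Python sketch. It is not GRUB's code (pc.c is written in C) and only
models the control flow I described. The names Partition, label_ok, DISK
and visible_partitions are made up for this example, and the assumption
that only sda9 fails the disk label check comes from the error above.

```python
from dataclasses import dataclass

TYPE_OPENBSD = 0xA6  # partition type ID shown as "a6" in the fdisk listing


@dataclass
class Partition:
    name: str
    type_id: int
    label_ok: bool = True  # hypothetical: would the disk label magic check pass?


# Logical partitions from the listing. sda9 is typed OpenBSD but is assumed
# to hold no valid disk label, matching the "invalid disk label magic" error.
DISK = [
    Partition("sda5", 0x82),
    Partition("sda6", 0x83),
    Partition("sda7", 0x83),
    Partition("sda8", 0x8E),
    Partition("sda9", TYPE_OPENBSD, label_ok=False),
    Partition("sda10", 0x83),
]


def visible_partitions(partitions, skip_invalid):
    """Return the names of the partitions the scan manages to reach."""
    found = []
    for part in partitions:
        if part.type_id == TYPE_OPENBSD and not part.label_ok:
            if skip_invalid:
                continue  # patched behaviour: ignore this entry, keep scanning
            break  # models the early error return: later entries are never seen
        found.append(part.name)
    return found


if __name__ == "__main__":
    print(visible_partitions(DISK, skip_invalid=False))
    print(visible_partitions(DISK, skip_invalid=True))
```

With skip_invalid=False the scan stops at sda9 and never reports sda10;
with skip_invalid=True sda9 is skipped and sda10 is found.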

Compare what grub-emu says when sda9 has the wrong type:

grub> ls (hd0,10)
error: unknown device

And this is what it says when sda9 has the correct type:
grub> ls (hd0,10)
      Partition hd0,10: Filesystem type ext2, Label debian_BOOT



>   - Send it to grub-devel at gnu.org
>   
Done
>   ?
>
> Maybe someone there has an idea, but if it's memory corruption and we can't
> reproduce it, tracing the problem remotely isn't going to work very well.
>   

It wasn't memory corruption. However, I ran valgrind and it showed
some leaks, plus a call to stat() with a NULL parameter.
The attached patch fixes some of the valgrind warnings. Some leaks still
remain; I have attached the new valgrind logs.

P.S.: grub2 seems to work now; I am able to boot with it using the
text-mode menu. The default graphics mode doesn't work, so I will open a
separate bug about that.

Best regards,
--Edwin

-------------- next part --------------
A non-text attachment was scrubbed...
Name: grub2.patch
Type: text/x-diff
Size: 1032 bytes
Desc: not available
Url : http://lists.alioth.debian.org/pipermail/pkg-grub-devel/attachments/20080511/7658484b/attachment.patch 
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: vallog
Url: http://lists.alioth.debian.org/pipermail/pkg-grub-devel/attachments/20080511/7658484b/attachment.txt 

