[Pkg-sysvinit-devel] Bug#653073: bug#10363: /etc/mtab -> /proc/mounts symlink affects df(1) output for

Goswin von Brederlow goswin-v-b at web.de
Wed Jan 18 14:25:05 UTC 2012


"Alan Curry" <pacman-cu at kosh.dhis.org> writes:

> jidanni at jidanni.org writes:
>> 
>>   Filesystem                                             1K-blocks    Used Available Use% Mounted on
>>   rootfs                                                   1071468  287940    729100  29% /
>>   /dev/disk/by-uuid/551e44e1-2cad-42cf-a716-f2e6caf9dc78   1071468  287940    729100  29% /
>
> (I'm replying only on the issue of the duplicate mount point. Someone else
> can tackle the long ugly name.)
>
> The one with "rootfs" as its device is the initramfs which you automatically
> get with all recent kernels. Even if you aren't using an initramfs, there's
> an empty one built into the kernel which gets mounted as the first root
> filesystem. The real root gets mounted on top of that.
>
> So this is a special case of a general problem with no easy solution: What
> should df do when 2 filesystems are mounted at the same location? It can't
> easily give correct information for both of them, since the later mount
> obscures the earlier mount from view.

The problem also exists to a larger extent with chroots. There will be
lots of entries from outside the chroot that are inaccessible to a df
running inside the chroot.

What df should do is automatically skip the entries that are obscured or
generally inaccessible. Unfortunately the kernel does not (re)sort the
entries correctly following a mount --move call:

rootfs / rootfs rw 0 0
none /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
none /proc proc rw,nosuid,nodev,noexec,relatime 0 0
none /dev devtmpfs rw,relatime,size=491516k,nr_inodes=122879,mode=755 0 0
none /dev/pts devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0
/dev/mapper/s-root / ext3 ro,relatime,errors=remount-ro,data=ordered 0 0
tmpfs /lib/init/rw tmpfs rw,nosuid,relatime,mode=755 0 0
...

Going by that list, the /dev/mapper/s-root filesystem obscures the
rootfs, /sys, /proc, /dev and /dev/pts. In reality, though, only the
rootfs is obscured, because the rest were moved before the initramfs
switched / around. What the kernel should do is move the relevant
entries so they appear below the filesystem they are moved to (and
before any that do obscure them; moving them to the bottom isn't always
the right solution).

So at the moment it is a bit of a guess which entries are real and which
are obscured. The best you can do is check that each entry is actually a
mountpoint and guess that the last of identical mountpoints is the right
one.
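
A rough sketch of that heuristic in Python, just to make it concrete.
The names parse_mounts and visible_mounts are made up for this example,
and it assumes plain /proc/mounts-style lines (octal escapes such as
\040 in paths are not decoded). It is only the guess described above,
so it inherits the same ordering problem after a mount --move:

```python
import os

def parse_mounts(text):
    """Parse /proc/mounts-style text into (device, mountpoint, fstype) tuples."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        entries.append((fields[0], fields[1], fields[2]))
    return entries

def visible_mounts(entries, is_mountpoint=os.path.ismount):
    """Keep the last entry per mountpoint and drop paths that are not mountpoints."""
    last = {}
    for device, mountpoint, fstype in entries:
        # Remove any earlier entry first so the result is ordered by last occurrence.
        last.pop(mountpoint, None)
        last[mountpoint] = (device, mountpoint, fstype)
    # The mountpoint check is injectable so the guess can be tried on sample data.
    return [entry for entry in last.values() if is_mountpoint(entry[1])]
```

Calling visible_mounts(parse_mounts(open("/proc/mounts").read())) would
drop the rootfs line from the list above, since /dev/mapper/s-root is
the later entry for /.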

> If there's a way for df to get the correct information for the lower mount, I
> don't know what it would be. If you have a process with a leftover cwd or
> open fd in the obscured filesystem, you can use that. But generally you
> won't.

There afaik isn't and there should not be a way to do so.

> But maybe we could do better than reporting incorrectly that the lower mount
> has size and usage identical to the upper mount! At least df could print a
> warning at the end if it has seen any duplicate entries. Perhaps there is
> some way it could figure out which one is on top, and print a bunch of
> question marks as the lower mount's statistics.

Maybe compare the major/minor of the device node with statfs() output.
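
Something along these lines, as a sketch only. statfs() itself does not
hand back a major/minor pair, so this uses st_dev from stat() on the
mountpoint as a stand-in, which is an assumption on my part. The helper
names same_device and entry_is_on_top are made up. It cannot say
anything useful for entries like "rootfs" or "none", which are not
device nodes, and filesystems that report an anonymous device number
will not match either:

```python
import os
import stat

def same_device(node_rdev, fs_dev):
    """True if a device node's major/minor equals a filesystem's device ID."""
    return (os.major(node_rdev), os.minor(node_rdev)) == (
        os.major(fs_dev), os.minor(fs_dev))

def entry_is_on_top(device_path, mountpoint):
    """Guess whether the filesystem visible at mountpoint is backed by device_path."""
    try:
        node = os.stat(device_path)  # follows symlinks such as /dev/disk/by-uuid/...
        top = os.stat(mountpoint)    # describes whatever is currently visible there
    except OSError:
        return False  # the device field is not a usable path
    if not stat.S_ISBLK(node.st_mode):
        return False  # only block device nodes carry a meaningful st_rdev here
    return same_device(node.st_rdev, top.st_dev)
```

An entry for which entry_is_on_top() is false would then be a candidate
for question marks instead of copied statistics.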

> If df is running as root, it might be able to unshare(2) the mount namespace,
> unmount the upper level, and then statfs the mount point again to get the
> correct results for the lower level. That won't work in all cases (even in a
> private namespace you can't unmount the filesystem containing your own cwd)
> and it does nothing for you if you're not root, but still... it would be a
> cool bonus in the cases where it does work.
>
> As a special case, "rootfs" should probably be excluded from the default
> listing, since the initramfs is not very interesting most of the time. It
> could still be shown with the -a option, although it would always have the
> wrong statistics. Or if you really want to be impressive, default to showing
> the initramfs if and only if it is the only thing mounted on "/" - so you can
> run df within the initramfs before the real root is mounted and get the right
> result.

What if you only have a rootfs?

Imho the /proc/mounts file should only contain entries visible in the
process's mount namespace. So for normal systems the rootfs shouldn't
appear and in chroots the list should be even shorter.

> Or... (brace yourself for the most bold idea yet)... can you imagine a kernel
> interface that would *cleanly* give access to obscured mount points?

I fear that would let too much information escape from/into the mount
namespaces.

But there could be a /proc/global-mounts or something that is only
readable from the root namespace.

> Comments on any of the above? Do the BSDs have any bright ideas we can steal,
> or is their df as embarrassingly bad at handling obscured mount points as
> ours?

MfG
        Goswin




