jessie: help debugging NFS shares not mounted at boot, double mounts with mount -a, and @reboot cronjobs

Sandro Tosi morph at debian.org
Fri Jan 29 16:23:45 GMT 2016


Hey Felipe,
thanks for your reply and sorry for getting back so late

On Mon, Jan 25, 2016 at 2:21 PM, Felipe Sateler <fsateler at debian.org> wrote:
> On 25 January 2016 at 09:04, Sandro Tosi <morph at debian.org> wrote:
>> Hello,
>> we are converting our installations to jessie and we are facing some
>> issues, which we believe are somehow related to how systemd boots the
>> system, and which can be summarized as follows (more details below):
>>
>> 1. some NFS shares are not mounted at boot
>> 2. using 'mount -a' as an @reboot cronjob (to work around 1.) duplicates
>> mounts
>> 3. @reboot jobs for programs on NFS mountpoints are started too early
>> (when the share is not yet mounted)
>
> I'll try to focus on the original problem first (failed mounts).

yup, makes total sense!

>
>>
>>
>> (1.) we believe that the NFS shares not being mounted at boot are also
>> causing the other problems, so let's start from this one.
>>
>> we define our NFS mounts in /etc/fstab like this:
>>
>> NFS_SERVER:VOLUME    MOUNTPOINT    nfs    rw,intr,tcp,bg,rdirplus,noatime,_netdev
>>
>> the mount options are always the same, and we have several of these
>> mounts (from different servers), even up to 20 on some hosts.
>>
>> some of those (not always the same ones and not on all the machines) are
>> not mounted during boot.
>
> Logs would be helpful (`journalctl -axb` would get you the current boot's
> logs).

sure, attached is the output of journalctl -axb on a system where only 1 of
the 10 NFS shares in /etc/fstab was not mounted (the output is lightly
anonymized, but it should have preserved all the relevant information).
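
(if it helps narrowing things down, the messages for the failing unit alone
can also be extracted with something like

# journalctl -b -u mnt-NFSSERVER.mount

where the unit name is the anonymized one from the systemctl output below)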

> Also, what are you using for networking? Ifupdown?

yup, we configure all our network interfaces in /etc/network/interfaces and
they are all "auto <iface>"
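
just to illustrate, each interface gets the usual ifupdown stanza, roughly:

auto eth0
iface eth0 inet dhcp

(interface name and method above are only placeholders, not our real config)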


> Does `systemctl status` print failed units?

yeah, here is the output of a few additional commands:

# systemctl status
● SERVER
    State: degraded
     Jobs: 0 queued
   Failed: 1 units

# systemctl | grep failed
● mnt-NFSSERVER.mount    loaded failed failed    /mnt/POINT

# systemd-analyze critical-chain
The time after the unit is active or started is printed after the "@"
character.
The time the unit takes to start is printed after the "+" character.

graphical.target @1min 32.289s
└─multi-user.target @1min 32.289s
  └─postfix.service @1min 32.101s +187ms
    └─nss-lookup.target @1min 32.097s
      └─unbound.service @1min 31.804s +292ms
        └─basic.target @1min 31.793s
          └─paths.target @1min 31.792s
            └─acpid.path @1min 31.792s
              └─sysinit.target @1min 31.790s
                └─console-setup.service @1min 31.719s +71ms
                  └─kbd.service @1min 31.698s +20ms
                    └─remote-fs.target @1min 31.697s
                      └─mnt-NFSSERVER.mount @1.568s +1min 30.128s
                        └─network-online.target @1.561s
                          └─network.target @1.561s
                            └─networking.service @246ms +1.314s
                              └─local-fs.target @245ms
                                └─var-lib-hugetlbfs-global-pagesize\x2d2MB.mount @1min 31.822s
                                  └─local-fs-pre.target @233ms
                                    └─systemd-remount-fs.service @230ms +3ms
                                      └─keyboard-setup.service @145ms +84ms
                                        └─systemd-udevd.service @142ms +2ms
                                          └─systemd-tmpfiles-setup-dev.service @118ms +23ms
                                            └─kmod-static-nodes.service @109ms +6ms
                                              └─system.slice @106ms
                                                └─-.slice @106ms

>
> Without more information on how this is failing (ie, logs), it is very
> hard to help.

I was just not sure which information to provide; hopefully what's here is
enough, but I can provide more if needed.

> BTW, be sure to order cron.service after remote-fs.target if your
> @reboot commands need those paths available, otherwise cron might run
> before they are mounted. This should solve problem 3.

yup, we'll do that.
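
Concretely, what we had in mind (just a sketch, not tested yet, and the file
name is arbitrary) is a drop-in at
/etc/systemd/system/cron.service.d/remote-fs.conf containing

[Unit]
After=remote-fs.target

followed by a `systemctl daemon-reload`.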

An additional question: in the critical-chain output above, we see that the
network targets (network.target and network-online.target) are correctly
reached before remote-fs.target, but how does systemd decide that the network
is "online"? Our doubt comes from the fact that sometimes (though not on the
machines I provided the log for) we have multiple physical NICs in a machine:
is the network "online" as soon as one of those interfaces is up, or only when
all of them are configured and 'UP' in the "ip link" sense?
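
(to dig into this ourselves we were going to check what that target pulls in,
e.g. with

# systemctl list-dependencies network-online.target

but any pointer on how ifupdown/networking.service signals that the network
is ready would be very welcome)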

thanks a ton for your help!

Cheers,
-- 
Sandro "morph" Tosi
My website: http://matrixhasu.altervista.org/
Me at Debian: http://wiki.debian.org/SandroTosi
G+: https://plus.google.com/u/0/+SandroTosi
-------------- next part --------------
A non-text attachment was scrubbed...
Name: boot_log.gz
Type: application/x-gzip
Size: 16237 bytes
Desc: not available
URL: <http://alioth-lists.debian.net/pipermail/pkg-systemd-maintainers/attachments/20160129/49da7578/attachment-0002.bin>

