[pkg-ntp-maintainers] Bug#561622: Race condition between ntp and NetworkManager at entry to runlevel 2

Fri Dec 18 23:22:35 UTC 2009

To answer your questions:

/etc/resolv.conf looks like this when I boot into single-user mode:

domain local
search local
nameserver 192.168.1.1

Note that in single-user mode the network interfaces are up
and configured, and dhclient is running (but NetworkManager
is not running).  (Is it possible to boot Debian without
networking?)

There is no nameserver running on the Debian machine, but
there is one (DNSMasq) in my router (192.168.1.1).

The hosts line of /etc/nsswitch.conf:

hosts:    files mdns4_minimal [NOTFOUND=return] dns mdns4

And some extra information: the server lines from /etc/ntp.conf:

server 0.debian.pool.ntp.org iburst dynamic
server 1.debian.pool.ntp.org iburst dynamic
server 2.debian.pool.ntp.org iburst dynamic
server 3.debian.pool.ntp.org iburst dynamic

Also note that I am using dhcp for network configuration.
The dhcp server is in my router (192.168.1.1).

---

First, I don't think that ntpd is getting a permanent error
from the router's nameserver: syslog indicates that the
network interface is down at the time, so the router is
unreachable.

Second, it seems that the 'dynamic' option in ntp.conf
should tell ntpd to keep trying to resolve the server
addresses and contact the servers.  Yet I can assure you
that if ntpq -p says "No association ID's..." right
after boot, it will still say "No association ID's..."
an hour later.  One of us should probably look at the
ntp source code.
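
The symptom can be checked for mechanically. The following is only a
minimal sketch: the function name has_associations is made up for
illustration, and it matches on the "No association ID's" text quoted
above rather than on anything taken from the ntp source.

```shell
#!/bin/sh
# Sketch: detect the "no associations" symptom in ntpq -p output.
# has_associations is a hypothetical helper name, not part of ntp.

# Reads ntpq -p output on stdin.
# Succeeds (exit 0) when the failure message is NOT present.
has_associations() {
    ! grep -q "No association ID's"
}

# Example use on a live system (assumes ntpq is installed and ntpd is running):
#   ntpq -p 2>&1 | has_associations || echo "ntpd has no servers"
```

Running the example line right after boot and again later would show
whether ntpd ever recovers on its own.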

Third, I'm mystified as to why you don't have this problem,
but I'm not the only one who has it (see #535049).

The good news is that /etc/network/if-up.d/ntp seems
to be a good workaround for me.
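
A hook of that kind might look roughly like the sketch below. This is
NOT the script from the bug report; it is a simplified illustration.
It assumes ifupdown exports IFACE to scripts in if-up.d, and the names
should_restart_ntp and RESTART_CMD are hypothetical.

```shell
#!/bin/sh
# Hypothetical sketch of /etc/network/if-up.d/ntp.
# Not the script from the bug report; names below are illustrative only.

# Decide whether an interface coming up should trigger an ntp restart.
# Returns 0 for any named interface except loopback.
should_restart_ntp() {
    # Loopback coming up does not make the DNS server reachable.
    [ -n "$1" ] && [ "$1" != "lo" ]
}

# Assumption: restart through invoke-rc.d. Overridable via the environment
# so the logic can be tried without touching the real service.
: "${RESTART_CMD:=invoke-rc.d ntp restart}"

# IFACE is set by ifupdown when it runs the hook; if it is unset,
# the script does nothing.
if should_restart_ntp "${IFACE:-}"; then
    $RESTART_CMD
fi
```

Restarting ntpd once the interface is up gives it a fresh chance to
resolve the pool server names, which is why this sidesteps the race.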

________________________________
From: Kurt Roeckx <kurt at roeckx.be>
To: Carl Mascott <cmascott at yahoo.com>; 561622 at bugs.debian.org
Sent: Fri, December 18, 2009 5:03:12 PM
Subject: Re: [pkg-ntp-maintainers] Bug#561622: Race condition between ntp and NetworkManager at entry to runlevel 2

On Fri, Dec 18, 2009 at 11:40:15AM -0800, Carl Mascott wrote:
> Subject: Race condition between ntp and NetworkManager at entry to runlevel 2
> Package: ntp
> Version: 1:4.2.4p4+dfsg-8lenny3
> Severity: important
> 
> *** Please type your report below this line ***
> Related bug: 535049
> Symptom: On some startups ntpq -p reports "No association ID's...".
>     On other startups ntpq -p reports servers & status normally.
>     I.E., problem is intermittent.
> Cause: When ntpq -p reports "No association ID's..." it is because
>     NetworkManager has not yet brought up the network interface
>     (eth0 for me) when ntpd is trying to resolve server IP
>     addresses.  If ntpd gives up on resolving addresses at
>     this point it will never try again later.
> Effect: ntpd does not provide time unless restarted manually.
> Scope: The symptom I see is a problem with ntp.  The underlying
>     cause could affect other network services as well.
> Non-solution: Reorder startup scripts in /etc/rc2.d.
> Correct solution: ??

I have no idea why ntpd is saying that resolving failed.  But that
means it got a permanent error resolving from somewhere.  I have
no idea how your configuration is set up, but I never get such a
permanent error.

I can bring up my wireless some time after boot without problems, and
ntpd will pick up the servers after some time.  Note that it doesn't
get them immediately when the network comes up, but it does get them.

Anyway, the current ntp init script has this in it:
# Required-Start:  $network

I have no idea how that interacts with network manager, but I
would expect at least some basic things to work.

Do you know what's in your /etc/resolv.conf before the network
is brought up by network manager?  Do you have some sort of
local nameserver?  What does "grep ^hosts /etc/nsswitch.conf"
return?

What I find weird in your log is:
> Dec 18 09:26:08 ganymede ntpd[2741]: signal_no_reset: signal 17 had flags 4000000

I've never seen that message before.  It seems the child sees
an old signal handler for SIGCHLD with SA_RESTORER set.  (I have
no idea why ntpd is logging this.)  This is probably unrelated
to your problem.

> Workaround: Use modified version of etch's /etc/network/if-up.d/ntp
>     script as follows:

There should not be a need to restart ntpd, I think something
else is broken.

Kurt
