[Pkg-utopia-maintainers] Bug#818362: connection timeouts while a large number of containers boot

Simon McVittie smcv at debian.org
Thu Mar 17 22:48:39 UTC 2016


Control: retitle 818362 connection timeouts while a large number of containers boot
Control: forwarded 818362 https://bugs.freedesktop.org/show_bug.cgi?id=74788

On Thu, 17 Mar 2016 at 12:47:06 +0100, Harald Dunkel wrote:
> > How many LXC containers are you booting, on what hardware, and what service is connecting to the system bus and getting rejected?
> 
> My test case is 31 containers on a quad core (+ht) Xeon E5420 CPU.
> The production machine is a 2 * 6core (+ht) Xeon E5-2630 running
> about 20 containers. On both hosts it can take 5 or 10 minutes until
> the last container gets its IP address via network (DHCP).

A useful workaround might be to stagger startup, so that not everything
starts at the same time?
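
As a rough illustration of staggering, here is a minimal sketch. It
assumes the containers can be started one at a time with
`lxc-start -n NAME -d`. The function name, the 5-second default delay and
the injectable `start`/`sleep` hooks are all hypothetical choices for
illustration, not something taken from your setup:

```python
import subprocess
import time


def start_staggered(names, delay_seconds=5.0, start=None, sleep=time.sleep):
    """Start containers one at a time, pausing between consecutive starts.

    `start` and `sleep` are injectable so the loop can be tried out
    without actually launching containers.
    """
    if start is None:
        # Assumption: each container is started with `lxc-start -n NAME -d`.
        def start(name):
            subprocess.run(["lxc-start", "-n", name, "-d"], check=True)

    started = []
    for index, name in enumerate(names):
        if index > 0:
            # Wait before every start except the first one.
            sleep(delay_seconds)
        start(name)
        started.append(name)
    return started
```

The delay would need tuning for the host: too short and the containers
still overlap, too long and 31 containers take a long time to come up.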

I'm a little surprised this is necessary, though; the timeout is
reasonably generous, and the handshake that the connections have to do
before the timeout is hit is relatively small and shouldn't involve
any significant I/O or computation.

> Wouldn't you agree that a high watermark on the number of used
> connection slots to enable the timeout restriction would have been
> a better choice?

Thanks, I've noted that suggestion upstream on
<https://bugs.freedesktop.org/show_bug.cgi?id=74788>.
Because this was treated as a security issue, the initial solution
was developed under embargo and designed to be minimal/targeted,
but that doesn't mean we can't improve on it later.
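
To make the suggestion concrete, the policy could be sketched roughly as
below. This is only an illustration of the idea and not how dbus-daemon
is implemented: the function, its parameters and the representation of
pending connections as (id, start time) pairs are all hypothetical.

```python
def connections_to_drop(pending, now, auth_timeout, high_watermark):
    """Return ids of unauthenticated connections that should be dropped.

    pending: list of (connection_id, started_at) pairs for connections
             that have not yet completed the handshake.
    The timeout is only enforced while the number of pending connections
    exceeds high_watermark; below that, slow clients are left alone.
    """
    if len(pending) <= high_watermark:
        # Plenty of free slots: do not enforce the timeout at all.
        return []
    return [conn_id for conn_id, started_at in pending
            if now - started_at > auth_timeout]
```

With a policy like this, a slow-booting host with only a few pending
connections would never hit the timeout, while a flood of incomplete
connections would still be cleaned up.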

I might not be able to implement this soon, but I'd be happy to
review patches from anyone interested in making this more scalable.

> Probably it's reasonable to ignore the timeout for uid0, but surely it
> will take some time till this change appears in a future Debian release.

This is the price we have to pay for a stable distribution: we avoid
changing anything non-critical in stable because it might introduce
a regression, but then non-critical bugs don't get fixed for a while.
If someone improves this in development versions of dbus-daemon, there'll
at least be something that you could backport locally if you have machines
that are hit particularly badly (like the LXC host you've described).

    S


