Bug#841662: libserver-starter-perl: test suite sometimes times out

Niko Tyni ntyni at debian.org
Sun Oct 30 18:03:49 UTC 2016


On Sat, Oct 22, 2016 at 08:15:46PM +0200, gregor herrmann wrote:
> On Fri, 21 Oct 2016 23:13:53 +0300, Niko Tyni wrote:
> 
> > This package occasionally fails its autopkgtest checks on ci.debian.net.
> > 
> >   https://ci.debian.net/packages/libs/libserver-starter-perl/unstable/amd64/

I've been looking at this for half a day, and it's annoyingly hard to
reproduce. Running t/01-starter.t in a loop, I've seen it deadlock a
dozen times or so altogether. When it happens, strace shows the child
blocked in accept() and its parent waiting for it to exit.

Adding instrumentation mostly makes the problem go away. The parent's
kill of the child with TERM does seem to succeed, but the child never
executes its $SIG{TERM} handler. I haven't been able to figure out why. Perhaps
the handler gets interrupted by another signal - my first thought was
SIGPIPE but adding a handler for that didn't show anything.

Given it fails somewhat regularly on both ci.debian.net and
tests.reproducible-builds.org, possibly a faster machine would improve
the chances of reproducing it.  Just getting the log of 'strace -f
-olog prove -l t/01-starter.t' when it locks up would help tremendously,
but I ran it for two hours or so like that without a single lockup.
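
A minimal sketch of such a capture loop, assuming GNU coreutils
timeout(1) is available (it exits with status 124 when the limit is
hit). The function name run_until_hang and the limits shown are
hypothetical, not part of the package. Because strace rewrites its
-o log on every run, stopping at the first hang leaves the log of the
hung run in place.

```shell
#!/bin/sh
# Usage: run_until_hang MAX_RUNS LIMIT_SECONDS COMMAND [ARGS...]
# Reruns COMMAND until one run exceeds LIMIT_SECONDS, then stops so that
# whatever log the last run wrote is left untouched.
run_until_hang() {
    max_runs=$1
    limit=$2
    shift 2
    i=1
    while [ "$i" -le "$max_runs" ]; do
        status=0
        timeout "$limit" "$@" || status=$?
        if [ "$status" -eq 124 ]; then
            # timeout(1) had to kill the command: treat this as a lockup.
            echo "run $i hung after ${limit}s"
            return 0
        fi
        i=$((i + 1))
    done
    echo "no hang in $max_runs runs"
    return 1
}

# Example invocation (the 1000 and 120 are arbitrary assumptions):
#   run_until_hang 1000 120 strace -f -o log prove -l t/01-starter.t
```

Processes left behind by the hung run may still need to be cleaned up
by hand afterwards.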

OTOH, reading https://rt.cpan.org/Public/Bug/Display.html?id=73711
I get the impression that the test suite is riddled with races that
are worked around by sprinkling sleep() calls in the test code.

Even though it feels like giving up, I suggest either disabling the test
suite or somehow guarding it with a timeout and making failures non-fatal.
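
One way the timeout guard could look, as a rough sketch only: it
assumes GNU coreutils timeout(1), and the wrapper name run_nonfatal and
the 300 second limit are made up for illustration. The wrapper reports
a timeout or failure as a warning and always returns success, so the
caller never fails because of the test suite.

```shell
#!/bin/sh
# Usage: run_nonfatal LIMIT_SECONDS COMMAND [ARGS...]
# Runs COMMAND under a time limit, reports the outcome, and always
# returns 0 so that a failure or hang is not fatal to the caller.
run_nonfatal() {
    limit=$1
    shift
    status=0
    timeout "$limit" "$@" || status=$?
    case $status in
        0)   echo "test suite passed" ;;
        124) echo "WARNING: test suite timed out after ${limit}s (ignored)" ;;
        *)   echo "WARNING: test suite failed with status $status (ignored)" ;;
    esac
    return 0
}

# Example invocation (the limit is an arbitrary assumption):
#   run_nonfatal 300 prove -l t
```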

Perhaps we should devise something very simple instead for a single
basic test?
-- 
Niko Tyni   ntyni at debian.org


