Bug#666631: libsystem-command-perl: FTBFS: tests failed

Niko Tyni ntyni at debian.org
Tue Apr 3 19:47:11 UTC 2012


tag 666631 - unreproducible
thanks

On Sun, Apr 01, 2012 at 04:06:21PM +0200, gregor herrmann wrote:
> tag 666631 + unreproducible
> thanks
> 
> On Sat, 31 Mar 2012 21:48:42 +0200, Lucas Nussbaum wrote:
> 
> > During a rebuild of all packages in sid, your package failed to build on
> > amd64.
> 
> Builds fine here (amd64 + i386 sid cowbuilder chroot).

I can reproduce this by putting the build host under load while the test suite runs.

It looks like the test assumes that in a sequence of

 $SIG{CHLD} = 'IGNORE';
 waitpid($pid, 0)
 kill 0, $pid

the kill can never succeed because the child process is already gone.

However, this doesn't seem to be the case, as the following script shows:

#!/usr/bin/perl -w

$SIG{CHLD} = 'IGNORE';
use Time::HiRes q/time/;

my $i = 0;
my $start;

while (1) {
    my $pid = fork();
    die "cannot fork" unless defined $pid;
    if ($pid == 0) {
        print "$$: exiting\n";
        exit 0;
    } else {
        my $w = waitpid($pid, 0);
        print "$$: waitpid $pid returned $w: $!\n";
        while (kill 0, $pid) {
            $start = time if !$i++;
        }
        if ($i) {
            die "$i loops in " . (time() - $start) . " seconds";
        }
    }
}
__END__

which under load finally quits here with something like
 32196 loops in 0.00708603858947754 seconds at - line 22.

showing there was a window of seven milliseconds where the
reaped child process could still be signaled.

I even caught strace output:

write(1, "13214: waitpid 22869 returned -1"..., 33) = 33
kill(22869, SIG_0)                      = -1 ESRCH (No such process)
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fdc0cc509d0) = 22870
wait4(22870, 0x7fff070762ec, 0, NULL)   = -1 ECHILD (No child processes)
write(1, "13214: waitpid 22870 returned -1"..., 33) = 33
kill(22870, SIG_0)                      = 0
kill(22870, SIG_0)                      = 0
kill(22870, SIG_0)                      = 0
[...]
kill(22870, SIG_0)                      = 0
kill(22870, SIG_0)                      = 0
kill(22870, SIG_0)                      = -1 ESRCH (No such process)
write(2, "395 loops in 0.00786995887756348"..., 58) = 58

Quoting waitpid(2):

       ECHILD (for waitpid() or waitid()) The process specified by pid
              (waitpid()) or idtype and id (waitid()) does not exist or is
              not a child of the calling process.  (This can happen for
              one's own child if the action for SIGCHLD is set to SIG_IGN.
              See also the Linux Notes section about threads.)

so it looks like an ECHILD error from waitpid() can't be trusted to mean
the child is really gone when SIGCHLD is set to SIG_IGN.
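
For illustration only, here is a minimal sketch of one way a test could
cope with that window: poll with kill 0 until the child can no longer be
signaled, giving up after a timeout. This is not a patch for the package.
wait_until_gone() is a hypothetical helper, and the 5 second timeout and
10 ms polling interval are arbitrary assumptions.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::HiRes qw(time sleep);

# Hypothetical helper (not part of the package or its test suite):
# poll until $pid can no longer be signaled, or give up after
# $timeout seconds. Returns 1 if the process is gone, 0 on timeout.
# Caveat: this cannot tell the original child from an unrelated
# process that happened to get the same pid later.
sub wait_until_gone {
    my ($pid, $timeout) = @_;
    my $deadline = time() + $timeout;
    while (kill 0, $pid) {
        return 0 if time() > $deadline;
        sleep 0.01;    # assumed polling interval
    }
    return 1;
}

$SIG{CHLD} = 'IGNORE';

my $pid = fork();
die "cannot fork: $!" unless defined $pid;
exit 0 if $pid == 0;    # child: exit immediately

# parent: waitpid may fail with ECHILD while the child is still
# signalable, so do not treat its return as proof the child is gone
waitpid($pid, 0);
my $gone = wait_until_gone($pid, 5);
print $gone ? "child $pid is gone\n"
            : "child $pid still signalable after 5s\n";
```

Another option would presumably be to leave SIGCHLD at its default
disposition, so that waitpid actually reaps the child before returning.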

No patch, and my time's up for tonight.
-- 
Niko Tyni   ntyni at debian.org





More information about the pkg-perl-maintainers mailing list