Bug#785557: perl: FTBFS on i386 and amd64: itimer problems on buildds?

Dominic Hargreaves dom at earth.li
Mon Jun 1 15:14:32 UTC 2015


On Sun, May 24, 2015 at 07:38:19PM +0300, Apollon Oikonomopoulos wrote:
> On 16:38 Sun 24 May     , Ben Hutchings wrote:
> > On Sun, 2015-05-24 at 14:09 +0300, Niko Tyni wrote:
> > > On Sun, May 24, 2015 at 02:55:00PM +0800, Paul Wise wrote:
> > > > On Sat, 2015-05-23 at 19:10 +0200, Dominic Hargreaves wrote:
> > > > 
> > > > > This is rather strange; any ideas from DSA?
> > > > 
> > > > The underlying hosts do not have the same issue.
> > > > 
> > > > All of the guests use the same virtual CPU version/flags.
> > > > 
> > > > All of the guests use the same Linux kernel version.
> > > 
> > > Thanks for the update.
> > > 
> > > > I guess diving into the Linux implementation of times(2) for clues would
> > > > be the next step for figuring out what the issue is here.
> > > 
> > > I'm taking the kernel maintainers in the loop. The status here is that
> > > times(2) seems to be misbehaving on some i386 and amd64 debian.org virtual
> > > hosts running jessie (under ganeti/qemu, with jessie on the underlying
> > > hosts too). These hosts include at least barriere and x86-grnet-01.
> > > 
> > > The misbehaviour is that user time stays at zero all the time, as seen
> > > for example with 'time yes'. This is making perl fail to build from
> > > source due to test failures, and I'd expect it to affect other things too.
> > > 
> > > Any help is appreciated.
> > 
> > I can't reproduce this, but wonder if it's related to #784960?
> 
> There seems to be something fundamentally broken in 
> barriere.debian.org's CPU time accounting, not related to times(2) per 
> se. Just issuing
> 
>   yes >/dev/null
> 
> and firing up top -d1 gives the following interesting results:
> 
>   - `yes' shows up taking 100% CPU time as expected, but
>   - pressing `1' shows that all CPUs are idle (!)
> 
> htop OTOH displays all CPUs as constantly 100% busy, which is 
> inconsistent with the system's load average (~0.8 at that point).
> 
> Also watching the output of `cat /proc/$(pidof yes)/stat | awk '{ print 
> $14, $15 }'' ($14 is utime, $15 is stime per proc(5)) indeed shows 100% 
> system time and 0 user time.
> 
> If you look at the `top' stats for all CPUs of barriere.debian.org, it 
> looks as if the only thing that's correctly being accounted for is 
> iowait time.

It looks like the same thing has happened again on x86-grnet-01, meaning
we have issues[1] on

x86-grnet-01
brahms
binet

but not

babin
x86-csail-01

Buildd admins: please can the amd64 build of perl 5.22.0~rc2-2 be
given back to see if it lands on a working host?

DSA: can you identify any differences between the working hosts and the
others that would help pin down the root cause of this problem, assuming
the affected hosts all exhibit the easy-to-reproduce behaviour described
above?
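
For checking a given host without watching top by hand, the test quoted
above can be sketched as a small script. This is only an illustration and
has not been run on the affected hosts: the function names are
hypothetical, and it assumes Python 3.6 or later and a Linux /proc. It
burns CPU in a pure arithmetic loop, then reports what times(2) and
fields 14 and 15 (utime, stime) of /proc/self/stat say about it.

```python
import os


def parse_stat_times(stat_line):
    """Return (utime, stime) in clock ticks from one /proc/<pid>/stat line."""
    # Field 2 (comm) is parenthesised and may contain spaces, so split
    # after the last ')' instead of splitting the whole line on whitespace.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so fields 14 and 15 are rest[11], rest[12].
    return int(rest[11]), int(rest[12])


def burn_user_cpu(iterations=5_000_000):
    """Pure arithmetic loop: should be charged almost entirely to user time."""
    total = 0
    for i in range(iterations):
        total += i * i
    return total


def measure_self(iterations=5_000_000):
    """Return (user, system) CPU seconds consumed by the busy loop."""
    before = os.times()
    burn_user_cpu(iterations)
    after = os.times()
    return after.user - before.user, after.system - before.system


if __name__ == "__main__":
    user, system = measure_self()
    print(f"times(): user={user:.2f}s system={system:.2f}s")
    with open("/proc/self/stat") as fh:
        utime, stime = parse_stat_times(fh.read())
    print(f"/proc/self/stat: utime={utime} stime={stime} ticks")
    # Heuristic only: a busy loop with zero user time matches the symptom
    # described in this thread.
    if user == 0 and system > 0:
        print("user time did not advance: host looks affected")
```

On a healthy host the loop should show up almost entirely as user time;
on an affected host I would expect user to stay at zero, as with
'time yes'.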

Thanks!
Dominic.

[1] <https://buildd.debian.org/status/logs.php?pkg=perl&arch=amd64>



