[Pkg-xen-devel] Bug#880554: xen domu freezes with kernel linux-image-4.9.0-4-amd64

Hans van Kranenburg hans at knorrie.org
Fri Jan 12 00:34:10 UTC 2018


Hi,

On 08/01/2018 13:38, Valentin Vidic wrote:
> On Sun, Jan 07, 2018 at 07:36:40PM +0100, Hans van Kranenburg wrote:
>> Recently a tool was added to "dump guest grant table info". You could
>> see if it compiles on the 4.8 source and see if it works? Would be
>> interesting to get some idea about how high or low these numbers are in
>> different scenarios. I mean, I'm using 128, you 256, and we don't even
>> know if the actual value is maybe just above 32? :]
>>
>> https://xenbits.xen.org/gitweb/?p=xen.git;a=commit;h=df36d82e3fc91bee2ff1681fd438c815fa324b6a
> 
> The diag tool does not build inside xen-4.8:
> 
> xen-diag.c: In function ‘gnttab_query_size_func’:
> xen-diag.c:50:10: error: implicit declaration of function ‘xc_gnttab_query_size’ [-Werror=implicit-function-declaration]
>      rc = xc_gnttab_query_size(xch, &query);
>           ^~~~~~~~~~~~~~~~~~~~

Too bad. :|

> but I think the same info is available in the thread on xen-devel:
> 
>   https://www.mail-archive.com/xen-devel@lists.xen.org/msg116910.html

Ah, great, didn't see that one yet.

> When the domU hangs crash reports nr_grant_frames=32. After increasing
> the gnttab_max_frames=256 the domU reports using nr_grant_frames=59.
> 
> So the new default of gnttab_max_frames=64 might be a bit close to 59,
> but I suppose 128 would be just as safe as 256 I currently use (if
> you prefer 128).

Is the 59 your lots-o-vcpu-monster?

I just finished with the initial preparation of a Xen 4.10 package for
unstable and have it running in my test environment.

So, yay, I have xen-diag now.

-# /usr/lib/xen-4.10/bin/xen-diag
xen-diag: xen diagnostic utility
Usage: xen-diag command [args]
Commands:
  help                       display this help
  gnttab_query_size <domid>  dump the current and max grant frames for <domid>

-# /usr/lib/xen-4.10/bin/xen-diag gnttab_query_size 0
domid=0: nr_frames=1, max_nr_frames=64

That's a 10-vcpu PVHv2 domU with two disks attached, running a 4.14 guest
kernel, which has only been booted up and is idling now.

So, at least, nice to have some extra tooling available to help.
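
For anyone who wants to keep an eye on these numbers, here is a minimal
sketch of how the output could be checked from a script. It assumes the
line format is exactly what xen-diag printed above; other Xen versions
may print something different. The function names and the 0.9 threshold
are made up for illustration, they are not part of any Xen tooling.

```python
import re

# Assumption: the line looks exactly like the xen-diag output shown above,
# e.g. "domid=0: nr_frames=1, max_nr_frames=64".
_LINE_RE = re.compile(r"domid=(\d+): nr_frames=(\d+), max_nr_frames=(\d+)")


def parse_gnttab_line(line):
    """Return (domid, nr_frames, max_nr_frames) parsed from one output line."""
    match = _LINE_RE.search(line)
    if match is None:
        raise ValueError("unrecognised xen-diag output: %r" % (line,))
    domid, nr_frames, max_nr_frames = (int(group) for group in match.groups())
    return domid, nr_frames, max_nr_frames


def is_near_limit(nr_frames, max_nr_frames, threshold=0.9):
    """True when the frames in use reach the given fraction of the maximum.

    The default threshold is an arbitrary illustrative choice.
    """
    return nr_frames / max_nr_frames >= threshold


if __name__ == "__main__":
    domid, used, limit = parse_gnttab_line("domid=0: nr_frames=1, max_nr_frames=64")
    print(domid, used, limit, is_near_limit(used, limit))
```

With the numbers from this thread, 59 frames in use against a limit of 64
would be flagged by this check, while 59 against 128 would not.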

>> If this is something users are going to run into while not doing more
>> unusual things like having dozens of vcpus or network interfaces, then
>> changing the default could prevent hours of frustration and debugging
>> for them.
> 
> Yes, the failure case is quite nasty, as the domU just hangs without
> even suggesting grant frames might be the problem. Not sure if domU
> can detect this situation at all?

I can't comment on that, since I don't know. Anyone who does, please
chime in.

> Anyway, if the value cannot be increased, the situation should at least
> be mentioned in the NEWS.Debian of the xen package.

Since this has been reported multiple times already, and upstream has
bumped it to 64, my verdict would be:

* Bump default to 64 already like upstream did in a later version.
* Properly document this issue in NEWS.Debian and also mention the
option with documentation in the template grub config file, so there's a
bigger chance users who run unusually large numbers of disks/nics/cpus/etc
will find it.

...so we also better accommodate users who are using newer kernels in the
domU with blk-mq, and prevent them from wasting too much time and
getting frustrated for no reason.

I wouldn't be comfortable with bumping it above the current latest
greatest upstream default, since it would mean we would need to keep a
patch in later versions.

I'll prepare a patch to bump the default to 64 in 4.8, taking changes
from the upstream patch. I probably have to ask upstream (Juergen Gross)
why the commit that was referenced earlier bumps the default without
mentioning it in the commit message.

Since I just joined the Debian Xen team, I'll run anything I can come up
with through the team to get it approved. We'll target the next Stretch
stable update to get it in.

Thanks,
Hans


