Bug#642497: Bug#644601: [xserver-xorg-core] hard lock-up: [mi] EQ overflowing

Andreas Beckmann debian at abeckmann.de
Sun Oct 16 19:03:28 UTC 2011


[moving discussion back to your report #642497]

On 2011-10-13 12:44, JS wrote:
> Perhaps a conflict with nouveau should be added to the nvidia package to
> avoid this possibility, based on the warnings from NVIDIA.

Adding conflicts is not a good solution as we want to allow switching
between free and non-free drivers without requiring installing or
removing packages. Furthermore adding such a conflict may render a lot
of unrelated packages uninstallable. (Think of a live CD that has all
sorts of hardware support installed and some clever piece of hardware
detection that enables/disables the right things during boot. E.g.
nvidia and fglrx proprietary drivers can now be installed in parallel,
even if a "normal" system will use at most one of them.)
But eventually the problematic nouveau files can be diverted (like MESA
libGL) and be re-enabled depending on the setting of the glx alternative
... but first we have to find out what's causing the problems.

On 2011-10-16 16:01, JS wrote:
> I had a similar problem with the nvidia driver which resulted in
> easy-to-reproduce X server lockups. The Xorg.0.log show the "EQ overflowing"
> message followed by message that the xserver was in an infinite loop.
> 
> The bug is 642497:
>     http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=642497
> 
> It was fixed by purging the xserver-xorg-video-nouveau  followed by reinstall
> of the nvidia drivers and xserver.

Do you have libdrm-nouveau1a and/or libdrm-nouveau1 still installed?
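
A quick way to answer that, as a rough sketch: the helper name pkg_state is
made up for illustration, and the package list is just the names mentioned in
this thread. It prints the dpkg status of each package, or "not-installed" if
dpkg does not know the package at all.

```shell
# Hypothetical helper (not part of any package): report dpkg's view of
# each package named on the command line.
pkg_state() {
    for pkg in "$@"; do
        # dpkg-query exits non-zero for packages it has never heard of
        state=$(dpkg-query -W -f='${Status}' "$pkg" 2>/dev/null) || state="not-installed"
        printf '%s: %s\n' "$pkg" "${state:-not-installed}"
    done
}

pkg_state libdrm-nouveau1a libdrm-nouveau1 xserver-xorg-video-nouveau
```

A status of "install ok installed" means the package is fully installed;
"deinstall ok config-files" means it was removed but not purged.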

> If one tries to install the nvidia driver directly from the NVIDIA blog,
> there is a warning that the presence of shared libs from nouveau may cause
> problems. After doing this reinstall I've tested extensively and had no problems.

Purging and reinstalling should not be necessary; a restart of the X
server following the package installation/removal should be sufficient.
A system reboot may be necessary to return the GPU to a
defined state (in case both nvidia-driver and nouveau-whatever tried to
initialize the card in "their" way).

Since you had an easily reproducible way to trigger the problem ...
could you test something more? At the point where X hangs, is the
machine still usable? E.g. can you get into a console (Ctrl-Alt-F1 etc)
or SSH into the machine?

If libdrm-nouveau1a is not installed, install it (but not
xserver-xorg-video-nouveau) and test again.
If this does not trigger the problem, add xserver-xorg-video-nouveau and test.
You should get back to the "working" state by just uninstalling these two
packages.

Once the X server is stuck, run

    lsof -n -P | grep nouveau

from a console or SSH session to see whether something is currently using
a nouveau file.

Luckily there are only two nouveau-specific libraries:
/usr/lib/xorg/modules/drivers/nouveau_drv.so
/usr/lib/x86_64-linux-gnu/libdrm_nouveau.so.1
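
If grepping the full lsof output is too noisy, something like the following
might be easier to read. It is only a sketch: the function name
check_nouveau_libs is made up, and the paths are the two listed above (the
second one will differ on non-amd64 systems). Run it as root, otherwise lsof
may not be able to see what the X server has open.

```shell
# Hypothetical helper: for each path given, say whether the file exists
# and, if so, which processes currently have it open or mapped.
check_nouveau_libs() {
    for lib in "$@"; do
        if [ -e "$lib" ]; then
            echo "present: $lib"
            # lsof exits non-zero when no process has the file open
            lsof -n -P "$lib" 2>/dev/null || echo "  no process has it open"
        else
            echo "absent: $lib"
        fi
    done
}

check_nouveau_libs \
    /usr/lib/xorg/modules/drivers/nouveau_drv.so \
    /usr/lib/x86_64-linux-gnu/libdrm_nouveau.so.1
```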


Thanks.

Andreas




