Bug#642497: Bug#644601: [xserver-xorg-core] hard lock-up: [mi] EQ overflowing

JS jshaio at yahoo.com
Sun Oct 16 21:52:54 UTC 2011


I'm not yet familiar with the deeper issues you mention regarding adding
conflicts (I only recently switched from an rpm-based system).

However, installing the drivers directly from the NVIDIA .run file does
bring up an explicit warning about the nouveau driver (and about no other
driver). I just reviewed this warning again: it concerns the nouveau driver
being in use even when X is not running, and possibly also being present in
the initrd.
  NVIDIA: "If you have an initrd which loads the Nouveau driver, you will
   additionally need to ensure that Nouveau is disabled in the initrd. If
   your initrd understands the rdblacklist parameter, you can add the option
   rdblacklist=nouveau to your kernel's boot parameters."
[I was mistaken when I said it was the nouveau shared libs that were the issue.]
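
For reference, a minimal sketch of how the two conditions named in the NVIDIA
warning might be checked, assuming a Debian system with initramfs-tools. The
function name nouveau_loaded is hypothetical, and the initrd path is an
assumption (the Debian default); neither comes from the NVIDIA installer.

```shell
# Hypothetical helper: succeeds if a module listing on stdin
# (same layout as /proc/modules: module name in the first column)
# contains the nouveau kernel module.
nouveau_loaded() {
    awk '$1 == "nouveau" { found = 1 } END { exit !found }'
}

# Is the module loaded right now, whether or not X is running?
if [ -r /proc/modules ] && nouveau_loaded < /proc/modules; then
    echo "nouveau kernel module is loaded"
else
    echo "nouveau kernel module is not loaded"
fi

# Is it inside the initrd? The path below is an assumption (Debian default).
initrd="/boot/initrd.img-$(uname -r)"
if command -v lsinitramfs >/dev/null 2>&1 && [ -r "$initrd" ]; then
    lsinitramfs "$initrd" | grep nouveau || echo "no nouveau files in $initrd"
fi
```

If the module does turn up in the initrd, the rdblacklist=nouveau boot
parameter quoted above is the route NVIDIA suggests, provided the initrd
understands it.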

I do have this version of libdrm-nouveau1a installed (with no problems at all):
ii  libdrm-nouveau1a                     2.4.26-1  


The problem I reported was confined to the X server: I could not use the
keyboard to switch to another console, but there was never any problem
getting in with ssh from another machine, examining logs and initiating a
graceful restart.
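
A rough sketch of the kind of check that can be run over ssh while X is hung,
assuming the default log location /var/log/Xorg.0.log. The function name
eq_overflow_count is hypothetical; the lsof invocation is the one suggested
in the quoted message below.

```shell
# Hypothetical helper: print how many lines of an X server log
# mention the "EQ overflowing" message.
eq_overflow_count() {
    grep -c 'EQ overflowing' "$1" || true
}

log=/var/log/Xorg.0.log   # assumed default log location
if [ -r "$log" ]; then
    echo "EQ overflowing lines: $(eq_overflow_count "$log")"
fi

# List open files that belong to nouveau, if lsof is available.
if command -v lsof >/dev/null 2>&1; then
    lsof -n -P 2>/dev/null | grep nouveau || echo "no nouveau files open"
fi
```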

The set of packages related to this issue that I'm currently using
(and which are now pinned) is:
ii  glx-alternative-mesa                 0.1.94
ii  glx-alternative-nvidia               0.1.94                    
ii  glx-diversions                       0.1.94                    
ii  libdrm-nouveau1a                     2.4.26-1                         
ii  libegl1-mesa                         7.11-6                    
ii  libegl1-mesa-drivers                 7.11-6                    
ii  libgl1-mesa-dri                      7.11-6                    
ii  libgl1-mesa-glx                      7.11-6                    
ii  libgl1-nvidia-alternatives           280.13.really.275.28-1           
ii  libgl1-nvidia-glx                    280.13.really.275.28-1           
ii  libglapi-mesa                        7.11-6                    
ii  libgles2-mesa                        7.11-6                    
ii  libglu1-mesa                         7.11-6                    
ii  libglw1-mesa                         7.11-6                    
ii  libglx-nvidia-alternatives           280.13.really.275.28-1           
ii  libopenvg1-mesa                      7.11-6                    
ii  libosmesa6                           7.11-6                    
ii  libva-glx1                           1.0.12-2                         
ii  libxcb-glx0                          1.7-3                            
ii  libxcb-glx0-dev                      1.7-3                     
ii  mesa-common-dev                      7.11-6                    
ii  mesa-utils                           8.0.1-2+b1                       
ii  nvidia-alternative                   280.13.really.275.28-1           
ii  nvidia-detect                        280.13.really.275.28-1           
ii  nvidia-glx                           280.13.really.275.28-1           
ii  nvidia-installer-cleanup             20110729+2                
ii  nvidia-kernel-common                 20110729+2                
ii  nvidia-kernel-dkms                   280.13.really.275.28-1           
ii  nvidia-settings                      280.13-1                         
ii  nvidia-support                       20110729+2                
ii  nvidia-vdpau-driver                  280.13.really.275.28-1           
ii  nvidia-xconfig                       280.13-1                         
ii  xserver-xorg-core                    2:1.10.2.902-1                   
ii  xserver-xorg-video-nvidia            280.13.really.275.28-1 

[in addition, xserver-xorg-video-nouveau now has Pin-Priority=-1]
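
As an illustration only, a negative pin of this kind is usually expressed as
an APT preferences stanza. The sketch below prints one; the helper name
print_negative_pin and the target file name are assumptions, and the exact
stanza used on this system is not shown in this message.

```shell
# Hypothetical helper: print an APT preferences stanza that gives a
# package a negative priority, which prevents APT from installing it.
print_negative_pin() {
    printf 'Package: %s\nPin: release *\nPin-Priority: -1\n' "$1"
}

print_negative_pin xserver-xorg-video-nouveau
# To apply it, the output could be saved as root to a file such as
# /etc/apt/preferences.d/nouveau (file name is an assumption), then checked with:
#   apt-cache policy xserver-xorg-video-nouveau
```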

--- On Sun, 10/16/11, Andreas Beckmann <debian at abeckmann.de> wrote:

> From: Andreas Beckmann <debian at abeckmann.de>
> Subject: Bug#642497: Bug#644601: [xserver-xorg-core] hard lock-up: [mi] EQ overflowing
> To: "JS" <jshaio at yahoo.com>, 642497 at bugs.debian.org
> Date: Sunday, October 16, 2011, 3:03 PM
> [moving discussion back to your report #642497]
> 
> On 2011-10-13 12:44, JS wrote:
> > Perhaps a conflict with nouveau should be added to the nvidia package to
> > avoid this possibility, based on the warnings from NVIDIA.
> 
> Adding conflicts is not a good solution as we want to allow switching
> between free and non-free drivers without requiring installing or
> removing packages. Furthermore adding such a conflict may render a lot
> of unrelated packages uninstallable. (Think of a live CD that has all
> sorts of hardware support installed and some clever piece of hardware
> detection that enables/disables the right things during boot. E.g.
> nvidia and fglrx proprietary drivers can now be installed in parallel,
> even if a "normal" system will use at most one of them.)
> But possibly the problematic nouveau files can be diverted (like MESA
> libGL) and be reenabled depending on the setting of the glx alternative
> ... but first we have to find out what's causing the problems.
> 
> On 2011-10-16 16:01, JS wrote:
> > I had a similar problem with the nvidia driver which resulted in
> > easy-to-reproduce X server lockups. The Xorg.0.log shows the "EQ overflowing"
> > message followed by a message that the xserver was in an infinite loop.
> > 
> > The bug is 642497:
> >     http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=642497
> > 
> > It was fixed by purging xserver-xorg-video-nouveau, followed by a reinstall
> > of the nvidia drivers and xserver.
> 
> Do you have libdrm-nouveau1a and/or libdrm-nouveau1 still installed?
> 
> > If one tries to install the nvidia driver directly from the NVIDIA blog,
> > there is a warning that the presence of shared libs from nouveau may cause
> > problems. After doing this reinstall I've tested extensively and had no problems.
> 
> Purging and reinstalling should not be necessary; a restart of the X
> server following the package installation/removal should be sufficient.
> Possibly a system reboot could be necessary to return the GPU into a
> defined state (in case both nvidia-driver and nouveau-whatever tried to
> initialize the card in "their" way).
> 
> Since you had an easily reproducible way to trigger the problem ...
> could you test something more? At the point where X hangs, is the
> machine still usable? E.g. can you get into a console (Ctrl-Alt-F1 etc)
> or SSH into the machine?
> 
> If libdrm-nouveau1a is not installed - install it (but not
> xserver-xorg-video-nouveau) and test again.
> If this does not trigger the problem, add xserver-xorg-video-nouveau and test.
> You should get back to "working" state by just uninstalling these two
> packages.
> 
> Once the xserver got stuck, run
> 
>     lsof -n -P | grep nouveau
> 
> from a console/ssh to see whether something is currently using a nouveau
> file.
> 
> Luckily there are only two nouveau-specific libraries:
> /usr/lib/xorg/modules/drivers/nouveau_drv.so
> /usr/lib/x86_64-linux-gnu/libdrm_nouveau.so.1
> 
> 
> Thanks.
> 
> Andreas




