[pkg-nvidia-devel] Bug#441975: nvidia-glx should only provide the TLS version

Chris Reeves chris.reeves at iname.com
Thu Jul 24 00:16:11 UTC 2008


On Fri, Feb 15, 2008 at 18:59:30 +0100, Aurelien Jarno wrote:
> 
> severity 441975 serious
> 
> On Sat, Jan 19, 2008 at 12:57:42AM +0100, Aurelien Jarno wrote:
> > 
> > FYI the problem is that /etc/ld.so.nohwcap disables all optimized
> > libraries and uses the ones from /usr/lib. NVidia had the idea to provide
> > a TLS version (in /usr/lib/tls) and a non-TLS version (in /usr/lib) of
> > their library. Disabling optimized libraries means that the non-TLS
> > version of the library is used. However, their code chooses between TLS
> > and non-TLS code in a different way (a test code), which always succeeds
> > on recent systems with the NPTL library. This leads to a mix of TLS and
> > non-TLS code, leading to a crash.
> > 
> > I will add a workaround to glibc to also use the tls/ directory even when
> > optimized libraries are disabled, as TLS is always available in lenny.
> 
> This workaround causes problems when upgrading from etch to lenny, so it
> will be removed in the next upload. As a consequence, this bug really
> has to be fixed, so I am upgrading it to serious.

I have been able to reproduce this on a lenny machine with a 2.6.25-2 kernel.
To do so, one must use an nVidia graphics card with the nVidia binary driver,
and /etc/ld.so.nohwcap must exist. The test (as described in a previous
message) is that "perl -e 'use Qt'" segfaults.
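
For anyone who wants to script the reproduction, something like the following
might do. It is only a rough sketch in Python: the helper names are my own
invention, and it assumes nothing beyond the fact that a child process killed
by a segfault is reported on POSIX with a return code of -SIGSEGV.

```python
import os
import signal
import subprocess

NOHWCAP = "/etc/ld.so.nohwcap"  # flag file discussed in this report


def segfaults(cmd):
    """Return True if running cmd ends with the child killed by SIGSEGV."""
    result = subprocess.run(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    # On POSIX, death by signal N is reported as returncode == -N.
    return result.returncode == -signal.SIGSEGV


def reproduce():
    """Hypothetical helper: report whether the crash conditions hold here.

    Raises FileNotFoundError if perl is not installed.
    """
    return {
        "nohwcap_present": os.path.exists(NOHWCAP),
        "perl_qt_segfaults": segfaults(["perl", "-e", "use Qt"]),
    }
```

Calling reproduce() on an affected machine should show both entries as True;
on a machine without the flag file the perl test is expected to pass.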

This bug will affect any user of the nvidia-glx package who has their debconf
frontend set to kde (or similar) and tries to upgrade a package which makes
use of /etc/ld.so.nohwcap (e.g. libc6). These users will be affected
irrespective of whether nvidia-graphics-drivers makes it into lenny or not.


Aurelien is largely correct with this. The nVidia installer comes with two
different copies of libnvidia-tls.so.<version> inside the installer package.
 - According to the nvidia-installer docs, the version in
   <package-dir>/usr/lib is for glibc <= 2.2, while the version in
   <package-dir>/usr/lib/tls is for glibc >= 2.3. 
 - According to the README.Debian for nvidia-glx, however, the differing
   versions are for 2.4 and 2.6 kernels (presumably on the assumption that
   NPTL is implemented in the latter and not in the former).
Whichever of these interpretations is actually correct, the same version of
the library should be installed into both /usr/lib and /usr/lib/tls, so that
the presence of /etc/ld.so.nohwcap has no effect on which version of the
library is used.
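
A quick way to check whether an installed system already satisfies this
might look like the sketch below. It is an illustration only: the function
names and the glob pattern are my own assumptions, and it simply compares
file contents rather than asking the dynamic linker anything.

```python
import hashlib
from pathlib import Path


def digest(path):
    """SHA-256 of a file's contents (symlinks are followed)."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def mismatched_tls_libs(lib_dir="/usr/lib", tls_dir="/usr/lib/tls",
                        pattern="libnvidia-tls.so.*"):
    """Return names in lib_dir whose twin in tls_dir is missing or differs."""
    bad = []
    for lib in sorted(Path(lib_dir).glob(pattern)):
        twin = Path(tls_dir) / lib.name
        if not twin.exists() or digest(lib) != digest(twin):
            bad.append(lib.name)
    return bad
```

An empty list would mean that both directories provide the same library, so
the choice of directory can no longer change which code is loaded.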

On the basis of the nVidia docs it might seem reasonable to only ship the
second version, since lenny is guaranteed to come with glibc >= 2.3. In this
case we only require a one-line change to debian/rules to get things to work
(although the USE_TLS flag would become redundant and so we could also remove
related code and documentation).

On the other hand, if Randall's README.Debian is the more accurate of the two,
we might break things for some users with older kernels. In this case it would
take a few more changes to get things to work (keep both versions of
libnvidia-tls in /usr/lib/nvidia and modify the init scripts to symlink both
/usr/lib and /usr/lib/tls to the same version).
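
The symlinking step could be sketched roughly as follows. This is written in
Python purely for illustration (the real init scripts are shell), the
function name is hypothetical, and deciding which of the two versions to use
as the target is deliberately left to the caller.

```python
import os
from pathlib import Path


def link_both(target, lib_dir, tls_dir):
    """Point lib_dir/<name> and tls_dir/<name> at the same target file."""
    target = Path(target)
    for directory in (Path(lib_dir), Path(tls_dir)):
        directory.mkdir(parents=True, exist_ok=True)
        link = directory / target.name
        tmp = link.with_name(link.name + ".tmp")
        # Clear any stale temporary link left by an interrupted run.
        if tmp.is_symlink() or tmp.exists():
            tmp.unlink()
        os.symlink(target, tmp)
        # Rename over the old entry so the name never disappears.
        os.replace(tmp, link)
```

With both names resolving to one file, the presence or absence of
/etc/ld.so.nohwcap should make no difference to which copy is loaded.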

My vote would be for the second option (keeping both versions and symlinking).
It would be useful if people could express their preferences so that I can
produce a patch for the preferred option. Either option would fix the
nvidia-glx package, but would *not* fix the bug completely.


As I said earlier, this bug will affect the users described above irrespective
of whether an updated package makes it into lenny: the presence/use of an old
version of nvidia-glx will trigger this bug. In order to actually fix the bug,
nvidia-glx must be upgraded *before* libc6 (or any other package that uses
/etc/ld.so.nohwcap).

My thoughts on this would be to make affected packages (e.g. libc6) Conflict
with nvidia-glx (< fixed-version). I'm no expert on how Debian/apt resolves
dependencies, so I'm not 100% sure whether this will result in:
 - removal of nvidia-glx;
 - no upgrade of affected packages;
 - or upgrade of nvidia-glx before affected packages (the desired result).
I'm also unsure of the politics of getting the affected packages to make the
required change, especially considering that they are probably frozen (e.g.
libc6).

Your thoughts and input would be much appreciated.

Cheers,
    Chris




