[pkg-nvidia-devel] nvidia-glx should only provide the TLS version

Aurelien Jarno aurelien at aurel32.net
Thu Jul 24 17:12:48 UTC 2008


Chris Reeves wrote:
> On Fri, Feb 15, 2008 at 18:59:30 +0100, Aurelien Jarno wrote:
>> severity 441975 serious
>>
>> On Sat, Jan 19, 2008 at 12:57:42AM +0100, Aurelien Jarno wrote:
>>> FYI the problem is that /etc/ld.so.nohwcap disables all optimized
>>> libraries, so the ones from /usr/lib are used. NVidia had the idea to
>>> provide a TLS version (in /usr/lib/tls) and a non-TLS version (in
>>> /usr/lib) of their library. Disabling optimized libraries means that the
>>> non-TLS version of the library is used. However, their code chooses
>>> between TLS and non-TLS code in a different way (a test code), which
>>> always succeeds on recent systems with the NPTL library. This results in
>>> a mix of TLS and non-TLS code, which leads to a crash.
>>>
>>> I will add a workaround to glibc so that it also uses the tls/ directory
>>> even when optimized libraries are disabled, as TLS is always available in
>>> lenny.
>> This workaround causes problems when upgrading from etch to lenny, so it
>> will be removed in the next upload. As a consequence, this bug really
>> has to be fixed, so I am upgrading it to serious.
> 
> I have been able to reproduce this on a lenny machine with a 2.6.25-2 kernel.
> In order to do so one must use an nVidia graphics card with the nVidia binary
> driver and /etc/ld.so.nohwcap must exist. The test (as described in a
> previous message) is that "perl -e 'use Qt'" will segfault.
> 
> This bug will affect any user of the nvidia-glx package who has their debconf
> frontend set to kde (or similar) and tries to upgrade a package which makes
> use of /etc/ld.so.nohwcap (e.g. libc6). These users will be affected
> irrespective of whether nvidia-graphics-drivers makes it into lenny or not.
> 
> 
> Aurelien is largely correct about this. The nVidia installer comes with two
> different copies of libnvidia-tls.so.<version> inside the installer package.
>  - According to the nvidia-installer docs, the version in
>    <package-dir>/usr/lib is for glibc <= 2.2, while the version in
>    <package-dir>/usr/lib/tls is for glibc >= 2.3. 
>  - According to the README.Debian for nvidia-glx, however, the differing
>    versions are for 2.4 and 2.6 kernels (presumably on the assumption that
>    NPTL is implemented in the latter and not in the former).
> Whichever of these interpretations is actually correct, the same version of

I guess the most correct interpretation is the one in README.Debian or,
more precisely, that one with 2.4 replaced by non-NPTL and 2.6 by NPTL.

> the library should be installed into both /usr/lib and /usr/lib/tls so that
> the presence of /etc/ld.so.nohwcap does not affect which version of the
> library is used (which it shouldn't).

Or only the NPTL version in /usr/lib, as the non-NPTL version does not
exist anymore in Lenny.
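
To make the selection problem concrete, here is a minimal sketch in Python
of the lookup described above. It is a deliberately simplified model, not
the real glibc loader: it assumes only a single tls/ subdirectory and one
plain directory, and the function name pick_library and the library name
used with it are hypothetical.

```python
from pathlib import Path


def pick_library(lib_dir, soname, nohwcap_present):
    """Return the path a simplified loader model would pick, or None.

    Assumption: the only "optimized" subdirectory is tls/, and it is
    skipped entirely when /etc/ld.so.nohwcap is present.
    """
    lib_dir = Path(lib_dir)
    candidates = []
    if not nohwcap_present:
        # Optimized subdirectories are searched before the plain one.
        candidates.append(lib_dir / "tls" / soname)
    # The plain directory is always searched last.
    candidates.append(lib_dir / soname)
    for path in candidates:
        if path.exists():
            return path
    return None
```

With both copies installed, this model returns the tls/ path normally and
the plain path once the nohwcap file is present. That switch is what
exposes the mismatch when the two copies differ.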

> On the basis of the nVidia docs it might seem reasonable to only ship the
> second version, since lenny is guaranteed to come with glibc >= 2.3. In this
> case we only require a one-line change to debian/rules to get things to work
> (although the USE_TLS flag would become redundant and so we could also remove
> related code and documentation).
> 
> On the other hand, if Randall's README.Debian is the more accurate, we might
> break things for some users with older kernels. In this case it would take a
> few more changes to get things to work (keep both versions of libnvidia-tls in
> /usr/lib/nvidia and modify the init scripts to symlink both /usr/lib and
> /usr/lib/tls to the same version).

OTOH, as we switched to NPTL only in Lenny, older kernels (I mean 2.4
kernels) are not supported anymore. IIRC the minimum kernel is even
2.6.18 for i386. In that case it may be even easier to remove the
initscript, and provide only one symlink in /usr/lib directly in the
package.
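
Whichever layout is chosen, the property to verify is that the loader
cannot end up with a different library depending on whether tls/ is
searched. A minimal sketch of such a check in Python follows. The function
name tls_copies_consistent is hypothetical, and the paths usr/lib and
usr/lib/tls under a given root are assumed from the discussion above.

```python
import filecmp
from pathlib import Path


def tls_copies_consistent(root, soname):
    """Check that usr/lib and usr/lib/tls cannot yield different code.

    Returns True when only the plain copy exists, or when both copies
    have identical contents. Returns False when the plain copy is
    missing (nothing would be found with tls/ disabled) or when the
    two copies differ.
    """
    plain = Path(root) / "usr" / "lib" / soname
    tls = Path(root) / "usr" / "lib" / "tls" / soname
    if not plain.exists():
        return False
    if not tls.exists():
        # A single copy in the plain directory leaves no choice to get wrong.
        return True
    # shallow=False forces a byte-by-byte comparison of the two files.
    return filecmp.cmp(plain, tls, shallow=False)
```

Run against an unpacked package tree, a False result would flag exactly
the situation that triggers this bug.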

> My vote would be for the second option. It would be useful if people could
> express their preferences so that I can produce a patch for the preferred
> option. This would fix the nvidia-glx package, but does *not* fix the bug
> completely.
> 
> 
> As I said earlier, this bug will affect any of the users that I have
> previously described, irrespective of whether an updated package makes it into
> lenny - the presence/use of an old version of nvidia-glx will trigger this
> bug. In order to actually fix the bug, nvidia-glx must be upgraded *before*
> libc6 (or any other /etc/ld.so.nohwcap-using package).
> 
> My thoughts on this would be to make affected packages (e.g. libc6) Conflict
> with nvidia-glx (< fixed-version). I'm no expert on how Debian/apt resolves
> dependencies, so I'm not 100% sure whether this will result in:
>  - removal of nvidia-glx;
>  - no upgrade of affected packages;
>  - or upgrade of nvidia-glx before affected packages (the desired result).
> I'm also unsure of the politics of getting the affected packages to make the
> required change, especially considering that they are probably frozen (e.g.
> libc6).

I am trying to think of another alternative, but I currently can't see
one. If this is the best one, I don't think the freeze will block us
(that is, we can convince the release team).

-- 
  .''`.  Aurelien Jarno	            | GPG: 1024D/F1BCDB73
 : :' :  Debian developer           | Electrical Engineer
 `. `'   aurel32 at debian.org         | aurelien at aurel32.net
   `-    people.debian.org/~aurel32 | www.aurel32.net


