[xml/sgml-pkgs] Bug#763598: docbook-xml: xmllint fails to identify local copy of docbook entities file

Raphael Hertzog hertzog at debian.org
Wed Oct 1 07:58:01 UTC 2014


Package: docbook-xml
Version: 4.5-7.2
Severity: important

Consider the attached test document, which starts like this:

<?xml version='1.0' encoding='utf-8' ?>
<!DOCTYPE section [
<!ENTITY % BOOK_ENTITIES SYSTEM "Users_Guide.ent">
%BOOK_ENTITIES;
<!ENTITY % sgml.features "IGNORE">
<!ENTITY % xml.features "INCLUDE">
<!ENTITY % DOCBOOK_ENTS PUBLIC "-//OASIS//ENTITIES DocBook Character Entities V4.5//EN" "http://www.oasis-open.org/docbook/xml/4.5/dbcentx.mod">
%DOCBOOK_ENTS;
]>

Now I want to parse it (with publican, which uses libxml internally), but I always end
up loading http://www.oasis-open.org/docbook/xml/4.5/dbcentx.mod from the network instead
of finding the local copy. I can reproduce the problem with xmllint:

    $ XML_DEBUG_CATALOG=1 xmllint --debugent --nonet --noent --noout test.xml
    [...]
    Resolve: pubID -//OASIS//ENTITIES DocBook Character Entities V4.5//EN sysID http://www.oasis-open.org/docbook/xml/4.5/dbcentx.mod
    0 Parsing catalog file:///etc/xml/catalog
    file:///etc/xml/catalog added to file hash
    file:///etc/xml/docbook-xml.xml not found in file hash
    0 Parsing catalog file:///etc/xml/docbook-xml.xml
    file:///etc/xml/docbook-xml.xml added to file hash
    Trying system delegate file:///etc/xml/docbook-xml.xml
    Resolve URI http://www.oasis-open.org/docbook/xml/4.5/dbcentx.mod
    I/O error : Attempt to load network entity http://www.oasis-open.org/docbook/xml/4.5/dbcentx.mod
    [...]

This is not normal. It looks like only the system identifier (the URL) is used,
while the public identifier (for which there's a match in /etc/xml/docbook-xml.xml)
is ignored:

    $ grep -- "-//OASIS//ENTITIES DocBook Character Entities V4.5//EN" /etc/xml/docbook-xml.xml
    <delegatePublic publicIdStartString="-//OASIS//ENTITIES DocBook Character Entities V4.5//EN" catalog="file:///usr/share/xml/docbook/schema/dtd/4.5/catalog.xml"/>
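For illustration, here is a minimal Python sketch of the same check done on parsed XML rather than with grep. It assumes the standard OASIS catalog namespace and uses an inline excerpt (the single entry shown above, wrapped in a `<catalog>` root) instead of the real /etc/xml/docbook-xml.xml. The function name `matching_delegates` is hypothetical, and public identifier normalization is skipped.

```python
import xml.etree.ElementTree as ET

# Namespace used by OASIS XML Catalog files.
CATALOG_NS = "urn:oasis:names:tc:entity:xmlns:xml:catalog"

# Illustrative excerpt only: the entry from the grep output, wrapped in a
# <catalog> root so that it parses on its own.
SAMPLE_CATALOG = """<?xml version="1.0"?>
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <delegatePublic publicIdStartString="-//OASIS//ENTITIES DocBook Character Entities V4.5//EN" catalog="file:///usr/share/xml/docbook/schema/dtd/4.5/catalog.xml"/>
</catalog>
"""

def matching_delegates(catalog_xml, public_id=None, system_id=None):
    """List (entry type, prefix, delegated catalog) for delegate entries
    whose start string is a prefix of the given identifier."""
    root = ET.fromstring(catalog_xml)
    hits = []
    for elem in root.iter():
        name = elem.tag.replace("{%s}" % CATALOG_NS, "")
        if name == "delegatePublic" and public_id is not None:
            prefix = elem.get("publicIdStartString", "")
            if prefix and public_id.startswith(prefix):
                hits.append((name, prefix, elem.get("catalog")))
        elif name == "delegateSystem" and system_id is not None:
            prefix = elem.get("systemIdStartString", "")
            if prefix and system_id.startswith(prefix):
                hits.append((name, prefix, elem.get("catalog")))
    return hits

PUBLIC_ID = "-//OASIS//ENTITIES DocBook Character Entities V4.5//EN"
SYSTEM_ID = "http://www.oasis-open.org/docbook/xml/4.5/dbcentx.mod"

# The excerpt has a delegate for the public identifier but none for the URL.
print(matching_delegates(SAMPLE_CATALOG, public_id=PUBLIC_ID))
print(matching_delegates(SAMPLE_CATALOG, system_id=SYSTEM_ID))
```

To run it against a real catalog, the inline string would be replaced by the contents of the file, which may of course contain more entries than this excerpt.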

To confirm this impression I modified /etc/xml/docbook-xml.xml to replace this line:

    <delegateSystem systemIdStartString="http://docbook.org/xml/4.5/docbookx.dtd" catalog="file:///usr/share/xml/docbook/schema/dtd/4.5/catalog.xml"/>

With this one:

    <delegateSystem systemIdStartString="http://docbook.org/xml/4.5/" catalog="file:///usr/share/xml/docbook/schema/dtd/4.5/catalog.xml"/>

This allowed the catalog lookup to go one step further:

    Resolve: pubID -//OASIS//ENTITIES DocBook Character Entities V4.5//EN sysID http://www.oasis-open.org/docbook/xml/4.5/dbcentx.mod
    0 Parsing catalog file:///etc/xml/catalog
    file:///etc/xml/catalog added to file hash
    file:///etc/xml/docbook-xml.xml not found in file hash
    0 Parsing catalog file:///etc/xml/docbook-xml.xml
    file:///etc/xml/docbook-xml.xml added to file hash
    Trying system delegate file:///etc/xml/docbook-xml.xml
    file:///usr/share/xml/docbook/schema/dtd/4.5/catalog.xml not found in file hash
    0 Parsing catalog file:///usr/share/xml/docbook/schema/dtd/4.5/catalog.xml
    file:///usr/share/xml/docbook/schema/dtd/4.5/catalog.xml added to file hash
    Trying system delegate file:///usr/share/xml/docbook/schema/dtd/4.5/catalog.xml
    Resolve URI http://www.oasis-open.org/docbook/xml/4.5/dbcentx.mod
    I/O error : Attempt to load network entity http://www.oasis-open.org/docbook/xml/4.5/dbcentx.mod

And to finally get it to work, I had to add this line in
/usr/share/xml/docbook/schema/dtd/4.5/catalog.xml:

    <system systemId="http://www.oasis-open.org/docbook/xml/4.5/dbcentx.mod"
	    uri="dbcentx.mod"/>
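As a side note on the relative `uri="dbcentx.mod"` value: a relative URI in a catalog entry is resolved against the base URI of the catalog file that contains it. The short Python sketch below shows that resolution with the standard library, assuming the catalog sets no `xml:base` attribute that would change the base.

```python
from urllib.parse import urljoin

# Base URI: the catalog file that holds the new <system> entry.
catalog_file = "file:///usr/share/xml/docbook/schema/dtd/4.5/catalog.xml"

# Resolve the relative uri attribute against the catalog's own location
# (assumption: no xml:base attribute overrides the base).
resolved = urljoin(catalog_file, "dbcentx.mod")
print(resolved)
```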

Now I have this:

    Resolve: pubID -//OASIS//ENTITIES DocBook Character Entities V4.5//EN sysID http://www.oasis-open.org/docbook/xml/4.5/dbcentx.mod
    0 Parsing catalog file:///etc/xml/catalog
    file:///etc/xml/catalog added to file hash
    file:///etc/xml/docbook-xml.xml not found in file hash
    0 Parsing catalog file:///etc/xml/docbook-xml.xml
    file:///etc/xml/docbook-xml.xml added to file hash
    Trying system delegate file:///etc/xml/docbook-xml.xml
    file:///usr/share/xml/docbook/schema/dtd/4.5/catalog.xml not found in file hash
    0 Parsing catalog file:///usr/share/xml/docbook/schema/dtd/4.5/catalog.xml
    file:///usr/share/xml/docbook/schema/dtd/4.5/catalog.xml added to file hash
    Trying system delegate file:///usr/share/xml/docbook/schema/dtd/4.5/catalog.xml
    Found system match http://www.oasis-open.org/docbook/xml/4.5/dbcentx.mod, using file:///usr/share/xml/docbook/schema/dtd/4.5/dbcentx.mod
    new input from file: file:///usr/share/xml/docbook/schema/dtd/4.5/dbcentx.mod

There's something fishy either in the catalog files or in the logic of libxml2; I'm not
sure which. Looking at
https://www.oasis-open.org/committees/download.php/14809/xml-catalogs.html#s.ext.resx
it looks like the catalog file is at fault, since libxml2 does the
right thing by trying to use the system identifier first.
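To make that reading of the specification concrete, here is a simplified Python model of the resolution order for external identifiers. It is a sketch, not libxml2's code: it only handles system, delegateSystem, public and delegatePublic entries, and it assumes prefer="public" semantics. The catalog contents are partly hypothetical. Only the delegatePublic entry in /etc/xml/docbook-xml.xml comes from the grep output above; the entries for /etc/xml/catalog and the public entry in the DTD catalog are assumptions made for the example.

```python
ROOT = "file:///etc/xml/catalog"
PKG = "file:///etc/xml/docbook-xml.xml"
DTD = "file:///usr/share/xml/docbook/schema/dtd/4.5/catalog.xml"
PUBLIC_ID = "-//OASIS//ENTITIES DocBook Character Entities V4.5//EN"
SYSTEM_ID = "http://www.oasis-open.org/docbook/xml/4.5/dbcentx.mod"

# Each catalog is a list of (entry type, identifier or prefix, target).
CATALOGS = {
    ROOT: [
        # Assumed entries: the real /etc/xml/catalog is not shown in the report.
        ("delegateSystem", "http://www.oasis-open.org/docbook/xml/", PKG),
        ("delegatePublic", "-//OASIS//ENTITIES DocBook", PKG),
    ],
    PKG: [
        # Taken from the grep output in the report.
        ("delegatePublic", PUBLIC_ID, DTD),
    ],
    DTD: [
        # Assumed entry mapping the public identifier to the local file.
        ("public", PUBLIC_ID,
         "file:///usr/share/xml/docbook/schema/dtd/4.5/dbcentx.mod"),
    ],
}

def delegates(entries, kind, identifier):
    """Targets of matching delegate entries, longest prefix first."""
    found = [e for e in entries if e[0] == kind and identifier.startswith(e[1])]
    found.sort(key=lambda e: len(e[1]), reverse=True)
    return [target for _, _, target in found]

def resolve(catalogs, catalog_file, public_id, system_id):
    entries = catalogs[catalog_file]
    if system_id is not None:
        # Exact system entries are tried first.
        for kind, key, target in entries:
            if kind == "system" and key == system_id:
                return target
        # Once a delegateSystem entry matches, only the delegated catalogs
        # are consulted and the public identifier is ignored from then on.
        targets = delegates(entries, "delegateSystem", system_id)
        if targets:
            for target in targets:
                result = resolve(catalogs, target, None, system_id)
                if result is not None:
                    return result
            return None  # no fallback to the public identifier
    if public_id is not None:
        for kind, key, target in entries:
            if kind == "public" and key == public_id:
                return target
        targets = delegates(entries, "delegatePublic", public_id)
        if targets:
            for target in targets:
                result = resolve(catalogs, target, public_id, None)
                if result is not None:
                    return result
            return None
    return None

both = resolve(CATALOGS, ROOT, PUBLIC_ID, SYSTEM_ID)
public_only = resolve(CATALOGS, ROOT, PUBLIC_ID, None)
print("public + system:", both)
print("public only:", public_only)
```

In this model the lookup with both identifiers finds nothing, because the system delegation matches first and the delegated catalog has no entry for the URL, while the lookup by public identifier alone reaches the local file. That is consistent with the need for a `<system>` entry (or a matching system delegation chain) in the packaged catalogs.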

FWIW, I investigated this with the upstream author of Publican in this bugzilla ticket:
https://bugzilla.redhat.com/show_bug.cgi?id=1143060#c18

This is really blocking me from releasing the new version of Publican in Debian, so it
would be nice to find a fix quickly if possible, because the freeze is approaching.

-- System Information:
Debian Release: jessie/sid
  APT prefers squeeze-lts
  APT policy: (500, 'squeeze-lts'), (500, 'unstable'), (500, 'testing'), (500, 'stable'), (500, 'oldstable'), (1, 'experimental')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 3.16-2-amd64 (SMP w/4 CPU cores)
Locale: LANG=fr_FR.utf8, LC_CTYPE=fr_FR.utf8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages docbook-xml depends on:
ii  sgml-base  1.26+nmu4
ii  sgml-data  2.0.9-1
ii  xml-core   0.13+nmu2

docbook-xml recommends no packages.

Versions of packages docbook-xml suggests:
ii  docbook           4.5-5.1
pn  docbook-defguide  <none>
ii  docbook-dsssl     1.79-7
ii  docbook-xsl       1.78.1+dfsg-1

-- no debconf information

-- debsums errors found:
debsums: changed file /usr/share/xml/docbook/schema/dtd/4.5/catalog.xml (from docbook-xml package)

-- 
Raphaël Hertzog ◈ Debian Developer

Discover the Debian Administrator's Handbook:
→ http://debian-handbook.info/get/
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test.xml
Type: application/xml
Size: 11096 bytes
Desc: not available
URL: <http://lists.alioth.debian.org/pipermail/debian-xml-sgml-pkgs/attachments/20141001/139105e2/attachment.xml>

