[debiandoc-sgml-pkgs] Bug#397676: debiandoc-sgml: Creates unprocessable LaTeX code for CJK languages

Frank Küster frank at debian.org
Wed Nov 8 20:18:46 CET 2006


Package: debiandoc-sgml
Version: 1.1.94+1.1.95bpo1
Severity: serious

This has already been reported as #397571 (but this bug also has a
second aspect that has no relationship to debiandoc-sgml, therefore not
reassigning) and against debian-zh-faq (I think, no bugnumber yet, and
I've touched lots of packages), and now it shows up in exactly the same
way in maint-guide.  debiandoc2latexpdf or debiandoc2latexps create
LaTeX code that does not "compile" with latex.  For maint-guide, the
error message is:

This is pdfeTeX, Version 3.141592-1.21a-2.2 (Web2C 7.5.4)
entering extended mode
(./maint-guide.ja.tex
LaTeX2e <2003/12/01>
[...]
(/usr/share/texmf/tex/latex/CJK/standard.bdg)
(/usr/share/texmf/tex/latex/CJK/standard.enc)
(/usr/share/texmf/tex/latex/CJK/standard.chr)
! Undefined control sequence.
\try@size@range ...\extract@rangefontinfo \font@info 
                                                     <-*>\@nil <\@nnil 
l.98 {\Huge Debian 新
                     メンテナガイド} \\[2ex]
! Undefined control sequence.
<argument> \@nil 
                 
l.98 {\Huge Debian 新
                     メンテナガイド} \\[2ex]
! Argument of \extract@rangefontinfo has an extra }.
[...]

In the other packages it happened with zh-tw (Taiwanese Chinese, I
think).

Danai who's in Debbugs-Cc has some knowledge of the problems.  For
reference, I attach here what he wrote in the bug mentioned above:

********************************
Nope, reference.zh-tw.tex-in doesn't use UTF-8, it's written in Big5.
reference.zh-tw.tex itself is made with "bg5conv", and converts all
double-byte glyphs into TeX-readable strings; you can't read the
output with a human-readable encoding.

This step isn't necessary (anymore): just move reference.zh-tw.tex-in
to reference.zh-tw.tex and process it with "latex" as you would with
any other document.

And use "bkai" instead of "kai"; "kai" is reserved for ugly HBF bitmap
fonts, and I haven't packaged them yet (I find them of little use,
except perhaps the CNS fonts).

The author or the SGML processor seem(s) to use \textasciicircum et
alii commands a lot; in my DVI it shows as plain "textasciicircum".
The reason that it shows the plain name instead of actually showing
the intended sign is because of the encoding.
When viewing in the C locale, textbackslash{} is shown as
\textbackslash{}.  However, if you look at the file in Big5, you will
see that the backslash is actually part of the prior character.
You might need to put an "extra" backslash before commands such as
\textbackslash or \textasciicircum.

I played with search-and-replace a bit, and I think that it looks now
the way it should.  You'll find a copy of it in attachment.
Just run it with "latex" a few times.  That should be enough.

You can also use the UTF-8 encoding along with the CJKutf8 package.
That might be a bit more straightforward if you use a preprocessor.

I must say that the DebianDoc SGML DTD produces pretty ugly (Chinese)
TeX output.  Any chance of making enhancements?  E.g. when switching
from Chinese to Western script, it is recommended that you use a tilde
(~) to add some extra space; CJK provides a command called \CJKtilde
that redefines the tilde for this purpose.  Where you need a normal
tilde again (as in the URI of a homepage), \standardtilde restores it.
********************************
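
To illustrate the Big5 point in the quoted message: some Big5 characters
have 0x5C (the ASCII backslash) as their second byte, so a tool that
works on bytes can mistake half of a character for the start of a TeX
command.  The following is only a minimal Python sketch, assuming
Python's standard "big5" codec; the function names are made up for this
example and are not part of debiandoc-sgml or the CJK package.

```python
def chars_with_backslash_trail_byte(text, encoding="big5"):
    """Return the characters of `text` whose multi-byte encoded form
    ends in 0x5C, the ASCII backslash."""
    hits = []
    for ch in text:
        try:
            encoded = ch.encode(encoding)
        except UnicodeEncodeError:
            continue  # not representable in this encoding; skip it
        # Only multi-byte characters are of interest: a literal ASCII
        # backslash is a single byte and is not a "hidden" one.
        if len(encoded) > 1 and encoded[-1] == 0x5C:
            hits.append(ch)
    return hits


def big5_to_utf8(data):
    """Transcode Big5 bytes to UTF-8 bytes, e.g. as a preprocessing
    step before handing the file to LaTeX with CJKutf8."""
    return data.decode("big5").encode("utf-8")


if __name__ == "__main__":
    # Three well-known characters whose Big5 trail byte is 0x5C.
    print(chars_with_backslash_trail_byte("許功蓋"))
    # After transcoding, no byte of the character equals 0x5C.
    print(0x5C in big5_to_utf8("功".encode("big5")))
```

In UTF-8 every byte of a multi-byte character is 0x80 or higher, so a
byte-oriented escaping step cannot confuse part of a character with a
backslash; that is one reason the UTF-8 route may be the simpler one.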

Regards, Frank


-- System Information:
Debian Release: 3.1
  APT prefers unstable
  APT policy: (99, 'unstable')
Architecture: i386 (i686)
Kernel: Linux 2.6.17-2-686
Locale: LANG=de_DE@euro, LC_CTYPE=de_DE@euro (charmap=ISO-8859-15)

Versions of packages debiandoc-sgml depends on:
ii  libhtml-parser-perl       3.45-2         A collection of modules that parse
ii  libroman-perl             1.1-19         Perl module for converting between
ii  libsgmls-perl             1.03ii-31      Perl modules for processing SGML p
ii  libtext-format-perl       0.52-19        Perl module for formatting (text) 
ii  liburi-perl               1.35-1         Manipulates and accesses URI strin
ii  perl                      5.8.4-8sarge5  Larry Wall's Practical Extraction 
ii  perl-modules [libi18n-lan 5.8.4-8sarge5  Core Perl modules
ii  sgml-base                 1.26           SGML infrastructure and SGML catal
ii  sgml-data                 2.0.3          common SGML and XML data
ii  sgmlspl                   1.03ii-31      SGMLS-based example Perl script fo
ii  sp                        1.3.4-1.2.1-43 James Clark's SGML parsing tools

-- no debconf information

-- 
Dr. Frank Küster
Single Molecule Spectroscopy, Protein Folding @ Inst. f. Biochemie, Univ. Zürich
Debian Developer (teTeX/TeXLive)



