Bug#402118: [debiandoc-sgml-pkgs] Bug#402118: debiandoc-sgml: Please allow alternative dependency on texlive

Danai SAE-HAN (韓達耐) danai.sae-han at edpnet.be
Sun Jan 7 03:28:23 CET 2007


From: Osamu Aoki <osamu at debian.org>

> On Sat, Jan 06, 2007 at 06:44:03PM +0100, Danai SAE-HAN  wrote:
> > 
> > I also think that latex-cjk-all (+fonts) should be enough as a
> > Build-Depends for debian-reference, since latex-cjk-common already
> > depends on tetex or texlive.
> > 
> > But you might precipitate the move to TeXlive as default TeX
> > distribution by putting texlive|tetex (etc.) in the Build-Depends.
> > We all have to move to TeXlive now that teTeX is not actively
> > developed anymore, so changing to TeXlive might be a good exercise
> > now.  I myself will wait for the transition of latex-cjk to default to
> > the TeXlive distribution after my exams in February.
> 
> The first thing to wait for is the etch release, I think.  Then we can
> make large changes such as switching to texlive.

OK.

> > For debiandoc-sgml, I think latex-cjk-all is enough as a Recommends,
> > because latex-cjk-{chinese,japanese} recommends already some fonts.
> > No need to make it any more complex than it is. ;)
> 
> After the etch release and when you have time, can you look into the font
> choice of the debiandoc-sgml tool chain (not just for CJK but all other
> languages) and add explanatory text on how font packages are chosen
> (where to look and what to choose) in a generic way, so the document
> will be valid for future releases too.
> 
>  /usr/share/perl5/DebianDoc_SGML/Locale/xx_YY.ZZZZ/LaTeX
> 
> This has language specific definitions
> 
> As I review these, it looks like we need to have 2 entries xx_YY.ZZZZ
> for each language so we can handle both UTF-8 and traditional encodings.
> Hmmmm...  that calls for adding a new feature to debiandoc-sgml.  Maybe -l
> locale is not ending with utf-8 then we use traditional encoding but if
> they are ending with traditional encoding, we should use the UTF-8
> counterpart.  That will be a major rewrite of the scripts.
> 
> If we use current build process, UTF-8 encoded latex source needs to be
> converted to traditional encoding.  (Can be done by iconv running at -s
> option script fixlatex.)
> 
> But that sounds too complicated.  Does TeXlive have the capability to
> handle UTF-8 encoded source?

TeXlive's UTF-8 support is just like teTeX's: it all depends on the
TeX flavour you're using, and the most popular (TeX and LaTeX) have
poor support for multi-byte strings.

And using iconv is tricky, because sometimes iconv hangs with some
Chinese texts, for example the original tang300 fortune cookie in
simplified Chinese (Debian now offers a proper UTF-8 version).
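
The transcoding step discussed above (UTF-8 LaTeX source converted to a
traditional encoding) can be sketched roughly as follows.  This is only an
illustration under stated assumptions: it uses Python's built-in codecs
instead of iconv, the helper name to_legacy is hypothetical, and GB2312 is
picked arbitrarily as the "traditional" encoding.  It is not part of
debiandoc-sgml or fixlatex.  Rather than stopping on a character the target
encoding lacks, it substitutes '?' and reports the character.

```python
def to_legacy(utf8_bytes, target="gb2312"):
    """Transcode UTF-8 bytes to a legacy encoding (hypothetical helper).

    Returns (encoded_bytes, unencodable_chars).  Characters that the target
    encoding cannot represent are replaced with b"?" and collected so the
    caller can report them.  Encoding one character at a time is only safe
    for stateless encodings such as GB2312, Big5 or EUC-JP.
    """
    text = utf8_bytes.decode("utf-8")
    encoded = []
    missing = []
    for ch in text:
        try:
            encoded.append(ch.encode(target))
        except UnicodeEncodeError:
            missing.append(ch)
            encoded.append(b"?")
    return b"".join(encoded), missing


if __name__ == "__main__":
    # U+20000 is a CJK Extension B character, outside GB2312.
    data, missing = to_legacy("中文 \U00020000".encode("utf-8"))
    print(len(data), [hex(ord(c)) for c in missing])
```

Collecting the unencodable characters makes it visible which glyphs a
traditional encoding would lose, which is the main cost of this route
compared with keeping the source in UTF-8.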

There are packages like utf8x that are under development, but IMHO
that just takes too much time and we need a quick fix.  UTF-8 has been
around for many years now.

Luckily, CJK offers IMHO superior results the easy way, supporting
glyphs up to U+10FFFF, which is enough for most of us simple mortals.
Just set up a font with a font definition file and correct map files,
and off we go.  I could make some sort of a template file of my own
debian/rules file, and perhaps document it a little bit more.

Two things are needed though: an updated Unicode.sfd file in
freetype1-tools (I just filed a bug report), and a DFSG-free font that
covers most scripts.  There is Bitstream's Cyberbit TTF, but that
isn't DFSG-free.
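
For readers unfamiliar with subfont definition files: a TeX font can hold at
most 256 glyphs, so a large Unicode font has to be split into many subfonts.
The sketch below shows the idea for the Basic Multilingual Plane only.  It
assumes subfonts are named with the font family followed by two hexadecimal
digits taken from the high byte of the code point; the function name and the
family name "myfont" are placeholders, and the actual contents of
Unicode.sfd should be checked rather than inferred from this example.

```python
def subfont_for(codepoint, family="myfont"):
    """Return (subfont_name, slot) for a BMP code point.

    Assumption: 256-glyph subfonts named <family><two hex digits>, where the
    suffix is the high byte of the code point and the slot is the low byte.
    'myfont' is a placeholder family name, not a real font.
    """
    if not 0 <= codepoint <= 0xFFFF:
        raise ValueError("this sketch only covers the Basic Multilingual Plane")
    return "%s%02x" % (family, codepoint >> 8), codepoint & 0xFF


if __name__ == "__main__":
    name, slot = subfont_for(ord("韓"))
    print(name, slot)
```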

Titus is freely downloadable for individuals, but isn't DFSG-free
either.  And it mainly covers Western scripts.

Arne Götje is working on two unified Chinese TTFs that include CJK
Extension B (and perhaps C), but Arne told me that he prefers to wait,
because some structural problems have come up during development.

What we could do is make Type1 fonts for each script.  So we
convert fonts for Cyrillic, IPA, Greek, etc., and make a latex-cjk-*
package for each font.

The advantage is that it works 99% of the time (I've already tried it
for Georgian, Russian and CJK Extension B), and it's relatively easy to
make such
new fonts.  And if we can put a few heads together, we could extend
the sfd files in /usr/share/texmf/fonts/sfd/ [freetype1-tools] to
include other scripts, so we get not only Unicode fonts but also
linked fonts for the "ancient" encodings.

Another advantage is that with this system, there will be a unified
structure for each language.

The disadvantages are, firstly, that it would create quite a few extra MB
for the Debian servers.  Secondly, the CJK package was never
intended to be used for anything other than Chinese, Japanese, Korean,
Thai or Vietnamese, so language-specific problems (like complex ligatures
in Indic and Arabic languages) are not (yet?) supported.  And thirdly,
authors need to show some goodwill to switch to the new system.  But I
regard the last one as a minor problem, since the only change will be
the names of the fonts they need to know.

I'll have more time next month, in February, when my exams will be
over, to thoroughly check debiandoc-sgml about font issues.


Cheers



Danai SAE-HAN
韓達耐

-- 
題目:《和子由澠池懷舊》
作者:蘇軾(1036-1101)

人生到處知何似,應似飛鴻踏雪泥:
泥上偶然留指爪,鴻飛那復計東西。
老僧已死成新塔,壞壁無由見舊題。
往日崎嶇還記否,路上人困蹇驢嘶。



