Bug#389065: [debiandoc-sgml-pkgs] Bug#389065: debiandoc-sgml: [INTL:ru] Russian UTF-8 locale

Jens Seidel jensseidel at users.sf.net
Sun Nov 12 23:02:15 UTC 2006


On Sun, Sep 24, 2006 at 07:49:04PM +0900, Osamu Aoki wrote:
> On Sat, Sep 23, 2006 at 09:06:20PM +0200, Jens Seidel wrote:
> > yep, I agree that UTF-8 should be supported for a wider range of
> 
> In theory, it is a good direction.
> 
> Changing encoding to UTF-8 should be simple recoding for HTML and plain
> text but... PS and PDF are real work since tool chain (LaTeX) seems to
> be using good old local encoding.  If you can address both format,
> please propose fix.  If particular encoding does not mind breaking
> PS/PDF build script, they can change to UTF-8 now.

Russian, Ukrainian, Vietnamese and maybe other languages work well in
UTF-8 even for PS and PDF. (I noticed that the dash in PDF files looks a
little bit strange (thick but short) but that's not very important.)

I tested Russian for APT HOWTO and Debian Reference without problems.
Since Etch will be more UTF-8 centric, a UTF-8 default would be useful,
right?

There are a few problems related to this:
 All packages containing Russian documents would FTBFS. A simple recoding
 of the document (and for Debian Reference also of the *.ent file and
 bin/getdocdate) needs to be done, but only in the package, not in DDP
 CVS since the build host still runs Sarge. On the other hand, the DDP
 build could be deactivated until Etch.
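
 The recoding step itself could be sketched as follows. This is only a
 minimal illustration, assuming the Russian sources are currently stored
 as KOI8-R; the helper name recode() is hypothetical and not part of
 debiandoc-sgml or any package build script.

```python
def recode(path, src="koi8_r", dst="utf-8"):
    """Rewrite the file at `path` in place, converting it from `src` to `dst`.

    Assumption: the whole file is valid text in the `src` encoding.
    A UnicodeDecodeError is raised otherwise, and the file is left untouched
    because reading finishes before the file is reopened for writing.
    """
    # newline="" keeps the original line endings exactly as they are.
    with open(path, "r", encoding=src, newline="") as f:
        text = f.read()
    with open(path, "w", encoding=dst, newline="") as f:
        f.write(text)
```

 It would be called once per affected file, for example
 recode("some-document.ru.sgml") with a hypothetical file name. Any
 encoding name mentioned inside a file would still have to be adjusted
 by hand.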

 It's also possible to use ru_RU.KOI8-R as locale in the build script.
 But this would create filenames <document>.ru_RU.KOI8-R.html instead of
 <document>.ru.html (if option -c of debiandoc2html is used).

> But, ... I think Japanese,  Chinese, ... possibly Russian may need good
> work.  I really do not have time to do it.  Practical solution is to
> make behavior work with both style.  Any well thought action are
> welcomed

I suggest we wait a few more days to test the Asian languages with UTF-8
as proposed in the RC bug report. I will test these too.

But we should first decide whether we all agree that UTF-8 would be a
good idea. After this I could also post to debian-doc and provide help.
(Documentation packages could break the freeze to update translations,
so *now* is a good chance to switch to UTF-8. After the Etch release it
would be harder.)

Jens



