[Po4a-devel][RFC] Multi-lines verbatim blocks

Denis Barbier barbier@linuxfr.org
Wed, 17 Nov 2004 01:17:12 +0100


On Mon, Nov 15, 2004 at 11:55:31PM +0100, Nicolas François wrote:
> Hello,
>
> To solve an issue with the man module, I've implemented a way to specify
> some (multi-lines) verbatim blocks.

I did not understand your message on first reading, because I took
'verbatim blocks' to mean unformatted text.  Maybe you should talk about
'untranslated blocks' instead.

> I'm wondering if this functionality is of interest to po4a users (it
> may be error prone), and if it can be useful for other modules (in this
> case it should be implemented in TransTractor).

SGML prologs seem to be quite similar, and almost all formats can define
header comments.  So if I understand you right, it seems to be quite
generic.

> The major cause (50%) of failure of the man module (po4a exits with an
> error message) concerns blocks in roff language (using .de, .if, .ie or
> .el requests).
> They are usually used to define new macros or different ways to do the
> same thing depending on the parser (e.g. nroff or troff).
>
> Copying these blocks of code from the original to the translation is
> particularly useful for the man module because the roff language is quite
> complicated, which pushed authors to just cut & paste definitions found in
> other pages (a lot of the defined macros are not even used).

Agreed.

> With my first try, I could correctly process 60 additional files (without
> even defining additional macros). 25 different blocks were used (each one
> defined in a file).
>
> Not everything is so neat:
>   - if the block appears in the middle of a paragraph, the block will be
>     copied verbatim in the wrong place: at the beginning of the paragraph
>     (this could be fixed by adding a function to "flush" the parser).
>     Most of the time, it is not a concern because the macros are defined
>     in the header and it is up to the user to specify or not a verbatim
>     block, but it should probably be fixed.

IMO such blocks should split paragraphs into 3 parts: before, block
itself, after.
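
As a rough illustration of that three-way split, here is a minimal Python
sketch.  po4a itself is written in Perl; the function name split_on_block,
the list-of-lines representation, and the sample roff lines are all
assumptions made for this example, not po4a code.

```python
def split_on_block(paragraph, block):
    """Split a paragraph (a list of lines) around the first exact copy of block.

    Returns a (before, block, after) tuple, or None when the block
    does not occur in the paragraph.
    """
    n = len(block)
    if n == 0:
        return None
    for i in range(len(paragraph) - n + 1):
        if paragraph[i:i + n] == block:
            return paragraph[:i], paragraph[i:i + n], paragraph[i + n:]
    return None


# Illustrative input: two roff lines sitting in the middle of a paragraph.
paragraph = [
    "Some text to translate.",
    r".ie \n(.g .ds Aq \(aq",
    r".el       .ds Aq '",
    "More text to translate.",
]
block = [r".ie \n(.g .ds Aq \(aq", r".el       .ds Aq '"]

before, copied, after = split_on_block(paragraph, block)
```

With such a split, only "before" and "after" would be handed to the
translator, and "copied" would be written out unchanged between them.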

>   - it makes po4a slower (I don't think it's an issue, except for the
>     testsuite;): each line of the input document has to be compared with
>     a line of each specified file (and lines have to be shifted/unshifted
>     many times).

Do not worry, I have some ideas to optimize regexes.  Do you have a
large file that could be used for benchmarking?
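
Independently of any regex tuning (which is not detailed here), one
possible way to avoid comparing every input line against every block file
is to index the blocks by their first line.  The Python sketch below only
illustrates that idea; the names index_blocks and find_blocks and the
list-of-lines representation are hypothetical, not part of po4a.

```python
from collections import defaultdict


def index_blocks(blocks):
    """Map each block's first line to the list of blocks starting with it."""
    index = defaultdict(list)
    for block in blocks:
        if block:
            index[block[0]].append(block)
    return index


def find_blocks(lines, index):
    """Return (start, length) for each block occurrence, scanning lines once."""
    matches = []
    i = 0
    while i < len(lines):
        hit = None
        # Only blocks whose first line equals the current line are compared
        # in full; when several match, the longest one is kept.
        for block in index.get(lines[i], ()):
            if lines[i:i + len(block)] == block:
                if hit is None or len(block) > len(hit):
                    hit = block
        if hit is not None:
            matches.append((i, len(hit)))
            i += len(hit)
        else:
            i += 1
    return matches
```

Most input lines are then dismissed with a single dictionary lookup, and
no shifting/unshifting of lines is needed because the input is addressed
by index.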

>   - the user can do whatever he wants and it is, like the addenda, quite a
>     complicated feature which can be error prone.

I do not follow you here.  My understanding was that blocks are copied
from the original file, so translators have almost no control over them.

> Do you think it may be worth having such a mechanism in po4a?
> Is there an interest for the other modules?

Sounds like a very good idea, but maybe you should first explain in more
detail how you want to proceed.

> Note:
> For the man module, I'm also planning to add a verbatim and translate
> option (like in the sgml module), which should make it possible to specify
> the behaviour of the parser for these additional macros.
>
> Another Note:
> I've not tried this, but it may solve #263298 (Please let -gettextize know
> about addendums and remove them automatically).

This would be really cool: I am converting the man pages of manpages-fr by
hand, and it is quite boring ;)

Denis