[Po4a-devel] XML <? blabla ?> constructs

Martin Quinson martin.quinson@ens-lyon.fr
Fri, 16 Jul 2004 09:16:20 +0200 (CEST)


Quoting Michael Wiedmann <mw@miwie.in-berlin.de>:

> * Martin Quinson <mquinson@ens-lyon.fr> wrote [040715 22:05]:
> 
> > How should I support it? Is it reasonable to do the same trick as for
> > CDATA (which is not implemented yet either)? Can I just ignore this tag
> > and present it to the user as a string to translate, or is there any
> > cleverer action?
> 
> IMHO PIs should never be translated because they just contain a kind
> of hint for the processing command.
> 
> The best thing is to just ignore every PI.

The point is that the current design of the Sgml module makes it very
difficult to simply ignore some parts of the document, because I use an
external parser. I can change them into text before launching the parser, and
then turn them back into their original values. That's the trick I use to
protect entities, for example: &version; is rewritten to PO4A-ampversion;
before launching nsgml, and then restored while generating the document/po
files. But if I do so, I'm afraid that nsgml will start whining about CDATA
being placed where it shouldn't.
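
For illustration only, the entity trick can be sketched roughly as follows.
This is not the actual Sgml module code: the function names and the regular
expressions are assumptions, and only the PO4A-amp prefix comes from the
description above.

```python
import re

# Hypothetical helpers sketching the entity-protection trick.
# Only the "PO4A-amp" prefix is taken from the message; the patterns
# and function names are assumptions made for this example.
ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9._-]*);")
MASKED = re.compile(r"PO4A-amp([A-Za-z][A-Za-z0-9._-]*);")


def mask_entities(text):
    """Rewrite &name; to PO4A-ampname; so the external parser sees plain text."""
    return ENTITY.sub(r"PO4A-amp\1;", text)


def unmask_entities(text):
    """Turn PO4A-ampname; back into &name; after parsing."""
    return MASKED.sub(r"&\1;", text)
```

The mask is applied to the whole document before it is handed to the parser,
and the reverse substitution is applied to every string that comes back out.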

I am near the point where I decide that nsgml creates more problems than it
solves. Writing a ?ML parser in Perl from scratch shouldn't be that
difficult after all.

The only reason why I don't go further and implement my own parser is the
complexity of the code. As I said recently, I don't even remember the
differences between translate and indent (I guess that's a good argument for
reengineering the code ;). And there are some weird parts to support sgml
specificities such as conditional compilation (ok, that would also be easier
without nsgml). There is the file inclusion mechanism, too (but it should be
generalized and put in the core of po4a so that other modules can use it).

Ok, the real reason is my chronic lack of time, and the fear of introducing
new bugs which I would have to fix ASAP since some people are beginning to use
po4a...

I'm gonna try masking the <?bla?> as well as CDATA from nsgml, to see if it
does the trick.
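
A minimal sketch of what such masking could look like, assuming each
construct is swapped for a numbered placeholder and restored afterwards. The
PO4A-mask placeholder format and the function names are made up for this
example; whether nsgml accepts the placeholders wherever they land is exactly
what remains to be tested.

```python
import re

# Matches processing instructions (<? ... ?>) and CDATA sections.
# DOTALL lets either construct span several lines.
SPECIAL = re.compile(r"<\?.*?\?>|<!\[CDATA\[.*?\]\]>", re.DOTALL)


def mask_special(text):
    """Replace each PI or CDATA section with a numbered placeholder.

    Returns the masked text and the list of saved originals.
    The "PO4A-maskN;" format is hypothetical.
    """
    saved = []

    def stash(match):
        saved.append(match.group(0))
        return "PO4A-mask%d;" % (len(saved) - 1)

    return SPECIAL.sub(stash, text), saved


def unmask_special(text, saved):
    """Put the saved constructs back in place of their placeholders."""
    return re.sub(r"PO4A-mask(\d+);", lambda m: saved[int(m.group(1))], text)
```

Keeping the originals in a list rather than encoding them into the
placeholder means their content never reaches the parser at all.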

Mt.