[Po4a-devel] XML <? blabla ?> constructs

Jordi Vilalta jvprat@wanadoo.es
Fri, 16 Jul 2004 10:21:46 +0200 (CEST)


Hi,

On Fri, 16 Jul 2004, Martin Quinson wrote:
> ...
>
> The point is that the current design of the Sgml module makes it very 
> difficult to simply ignore some parts of the document because I use an 
> external parser. I can change them into text before launching the parser, and 
> then turn them back to their values. That's the trick I use to protect the 
> entities, for example. &version; is rewritten to PO4A-ampversion; before 
> launching nsgml, and then back while generating the document/po files. But if 
> I do so, I'm afraid that nsgml begins whining about CDATA being placed where 
> it shouldn't.
> 
> I am near the point where I decide that nsgml creates more problems than it 
> solves. Writing a ?ML parser in perl from scratch shouldn't be that 
> difficult after all.
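
For readers of the archive, the entity-protection trick quoted above could be
sketched roughly as follows. This is an illustration in Python, not po4a's
actual Perl code: the function names and the regular expression are
assumptions, and only the PO4A-amp prefix comes from the message itself.

```python
import re

# Marker taken from the message above; everything else here is hypothetical.
PREFIX = "PO4A-amp"

# Simplified notion of an entity name (assumption: letters, digits, '_', '.', '-').
_NAME = r"[A-Za-z_][\w.-]*"


def protect_entities(text: str) -> str:
    """Rewrite &name; to PO4A-ampname; so an external parser sees plain text."""
    return re.sub(rf"&({_NAME});", lambda m: f"{PREFIX}{m.group(1)};", text)


def restore_entities(text: str) -> str:
    """Turn PO4A-ampname; back into &name; when generating the output."""
    return re.sub(
        rf"{re.escape(PREFIX)}({_NAME});",
        lambda m: f"&{m.group(1)};",
        text,
    )
```

The round trip only works if the prefix never occurs in the document on its
own, which is why an unlikely marker string is used.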

My current implementation of the XML parser is quite generic and 
customizable (I think). Maybe it could replace the current sgml one (once 
it matures).

> 
> The only reason why I don't go further and reimplement my own parser is the 
> complexity of the code. As I said recently, I don't even remember the 
> differences between translate and indent (I guess that's a good argument for 
> reengineering the code ;). And there are some weird parts to support sgml 
> specificities such as conditional compilation (ok, that would also be easier 
> without nsgml). There is the file inclusion mechanism, also (but it should be 
> generalized, and put in the core of po4a so that other modules can use it).

I think that file inclusion and file encodings are two of the main gaps 
in the po4a core. The rest are only format modules, and these can always 
be extended.

> 
> Ok, the real reason is my chronic lack of time, and the fear of introducing 
> new bugs which I would have to fix ASAP since some people are beginning to use po4a...

It's open source, there are lots of hands out there that can help ;)

> 
> I'm gonna try masking the <?bla?> as well as CDATA from nsgml, to see if it does 
> the trick.
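
The masking idea in the last quoted paragraph might look something like the
sketch below. Again this is only a Python illustration under assumptions: the
PO4A-mask placeholder, the function names and the stash list are all made up
here, and the <? ... ?> pattern follows the XML-style form used in the subject
line rather than anything po4a is known to do.

```python
import re

# Matches XML-style processing instructions and CDATA sections (assumed syntax).
_MASKABLE = re.compile(r"<\?.*?\?>|<!\[CDATA\[.*?\]\]>", re.DOTALL)


def mask(text):
    """Replace each <?...?> or CDATA section with a numbered placeholder.

    Returns the masked text and the list of original fragments.
    """
    stash = []

    def _swap(match):
        stash.append(match.group(0))
        return f"PO4A-mask{len(stash) - 1};"  # hypothetical placeholder format

    return _MASKABLE.sub(_swap, text), stash


def unmask(text, stash):
    """Put the stashed fragments back in place of their placeholders."""
    return re.sub(r"PO4A-mask(\d+);", lambda m: stash[int(m.group(1))], text)
```

Unlike the entity case, the original fragment cannot be rebuilt from the
placeholder alone, so the fragments have to be kept aside until the document
is regenerated.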

Regards,

Jordi Vilalta