[Po4a-devel] XML <? blabla ?> constructs

Jordi Vilalta jvprat@wanadoo.es
Fri, 16 Jul 2004 23:15:35 +0200 (CEST)


Hi,

On Fri, 16 Jul 2004, Martin Quinson wrote:
> On Fri, Jul 16, 2004 at 10:21:46AM +0200, Jordi Vilalta wrote:
> > Hi,
> > 
> > On Fri, 16 Jul 2004, Martin Quinson wrote:
> > > I am near the point where I decide that nsgml creates more problems than
> > > it solves. Writing a ?ML parser in perl from scratch shouldn't be that
> > > difficult after all.
> > 
> > My current implementation of the XML parser is quite generic and 
> > customizable (I think). Maybe it could replace the current sgml one (when 
> > it gets mature).
> 
> Maybe it's time that:
>  - you create a user on alioth, so that we can give you CVS write access

Thanks for the offer :) I already have the "jvprat-guest" user.

>  - you show us the code of this xml module

I'll try to send something tomorrow. At the moment it's broken and does
nothing.

> ?
> 
> > > The only reason I don't go further and reimplement my own parser is the
> > > complexity of the code. As I said recently, I don't even remember the
> > > differences between translate and indent (I guess that's a good argument
> > > for reengineering the code ;). And there are some weird parts to support
> > > sgml specificities such as conditional compilation (ok, that would also be
> > > easier without nsgml). There is the file inclusion mechanism, too (but it
> > > should be generalized, and put in the core of po4a so that other modules
> > > can use it).
> > 
> > I think that file inclusion and file encodings are two of the main gaps
> > in the po4a core. The rest are only format modules, and these can always
> > be extended.
> 
> Yes, the most urgent thing is to move the file inclusion mechanism out to
> TransTractor.pm. I may try to work on this next week, but I'm completely
> overwhelmed currently. If you want to work on this, please mail the list
> first so that we make sure we don't duplicate the effort, and please
> proceed.
> 
> The problem is that detecting a file inclusion in the source is
> format-dependent. In sgml, that's done with entities, while groff uses the
> .so macro.
> 
> A solution may be to do something like this at the beginning of each module:
> 
> map { $doc->includefile($1) if /^\.so (.*)/ } @{$doc->{TT}{doc_in}};
> 
> It is a very crude proof of concept, though. The right solution may even be
> completely different ;)

At the moment I prefer staying out of the core modules.

I'm still quite new to perl, and I get stuck on statements like the one
above. I've been thinking about the concept, and I think that there are
three interesting cases of file inclusion:
1) include the file at the beginning of the input stream 
2) include the file at the end of the input stream
3) get the file in a separate list, to be able to parse it alone

Cases 2 and 3 can be reduced to case 1 with a little work from the module 
developer: save the names of the files to include in a list, and include 
them on reaching the end of the main file.

This should be done as a single function call, for example:
  $self->include_file("file");

This would include the file at the beginning of the input (case 1).

Well, I only see it from the document modules' point of view (the one I 
know), but I think that case 1 should be easy to implement.
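
To make the idea more concrete, here is a rough sketch of how the three
cases could fit together. It is only an assumption about how such a feature
might look: IncludeSketch, defer_include and the internal 'input' and
'deferred' fields are names made up for this example, and nothing here is
the real TransTractor.pm code.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for a document module; NOT the real po4a API.
package IncludeSketch;

sub new {
    my ($class, @lines) = @_;
    # 'input' is the pending input stream; 'deferred' holds the names of
    # files to include once the main file is exhausted (cases 2 and 3).
    return bless { input => [@lines], deferred => [] }, $class;
}

# Case 1: put the lines of the file at the beginning of the pending input.
sub include_file {
    my ($self, $filename) = @_;
    open(my $fh, '<', $filename) or die "Can't read $filename: $!";
    my @lines = <$fh>;
    close($fh);
    unshift @{ $self->{input} }, @lines;
    return;
}

# Cases 2 and 3: only remember the file name for later.
sub defer_include {
    my ($self, $filename) = @_;
    push @{ $self->{deferred} }, $filename;
    return;
}

# Return the next line; when the input runs dry, pull in deferred files
# one by one using the case 1 mechanism. Returns undef at the very end.
sub shiftline {
    my ($self) = @_;
    while (!@{ $self->{input} } && @{ $self->{deferred} }) {
        $self->include_file(shift @{ $self->{deferred} });
    }
    return shift @{ $self->{input} };
}

package main;
use File::Temp qw(tempfile);

# Demo with a throwaway file standing in for an included document.
my ($fh, $included) = tempfile(UNLINK => 1);
print {$fh} "included line\n";
close($fh);

my $doc = IncludeSketch->new("main line 1\n", "main line 2\n");
$doc->include_file($included);     # case 1: read before the rest
$doc->defer_include($included);    # case 2: read after the main file
while (defined(my $line = $doc->shiftline)) {
    print $line;
}
```

The point of the sketch is that only include_file touches the input
stream; the deferred cases are just bookkeeping on top of it, which is why
I think the core would only need to provide case 1.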

> 
> For the encoding issues, it's even worse. I don't even have the beginning of
> an answer. We may have to wait until Denis comes back.

I have a vague idea about it. I'll think about it over the next few days.

Regards,

Jordi Vilalta