[Po4a-devel]Xml.pm

Jordi Vilalta jvprat@wanadoo.es
Mon, 19 Jul 2004 15:22:02 +0200 (CEST)


On Sun, 18 Jul 2004, Martin Quinson wrote:
[...]
> Sounds interesting. Some comments:
>  - the wrap option must not be a module option, but should differ depending
>    on the tag. Proposition:
> 		<abbrev>
> 		W<acronym>
> 		W<arg>
> 		<artheader>
> 		<attribution>
> 		<date>
>    with 'w' meaning wrap (by default) and 'W' meaning don't wrap.

Interesting notation, I'll try to implement it. But I think there should 
still be a module option to select the default behavior, maybe not from the 
command line, but for derived modules.
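
To make the notation concrete, here is a minimal sketch of how such tag 
specifications could be parsed, with a module-level default for tags that 
carry no prefix. The name parse_tag_spec and the $default_wrap parameter are 
hypothetical; nothing like this exists in Xml.pm yet.

```perl
use strict;
use warnings;

# Hypothetical helper: split a tag specification such as "W<acronym>"
# into the tag and a wrap flag. A leading 'w' means wrap, 'W' means
# don't wrap, and no prefix falls back to $default_wrap.
sub parse_tag_spec {
    my ($spec, $default_wrap) = @_;
    $default_wrap = 1 unless defined $default_wrap;
    if ($spec =~ /^\s*([wW]?)(<[^>]+>)\s*$/) {
        my ($flag, $tag) = ($1, $2);
        my $wrap = $flag eq 'w' ? 1
                 : $flag eq 'W' ? 0
                 :                $default_wrap;
        return ($tag, $wrap);
    }
    die "Malformed tag specification: '$spec'\n";
}

for my $spec ('<abbrev>', 'W<acronym>', 'w<date>') {
    my ($tag, $wrap) = parse_tag_spec($spec, 1);
    printf "%-10s %s\n", $tag, $wrap ? 'wrap' : "don't wrap";
}
```

A derived module could pass 0 as the second argument to make "don't wrap" 
its default without touching the per-tag flags.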

>  - the error messages should use dgettext to get translated. The reason why
>    it does not use gettext just as the scripts is that one day, maybe,
>    someone else than us will want to use the libraries (Po.pm, for example).
>    That day, we should search our translation in our own domain, not the one
>    of the guys using our code.

Oops, thanks for the explanation. I'll change it now.

>  - you should die on error, not return an invalid string (get_structure())

I'm not sure it's really an error. If anything, it would be an error in the 
document. Allowing it might be useful to parse some strange DTDs that 
accept text outside the root tag (I think it goes against the XML concept, 
but you never know...).

>  - get_structure may be renamed get_path, no?

Yes, it's clearer. I failed to find good names for lots of things :(

>  - don't comment out the debugging statements. Use the debug option value to
>    determine whether to print them or not

I had in mind to do it in the future. The current debugging messages 
are only helpful for building the parser.

>  - does get_string_until() really return the references (as the doc says)?
>    Wouldn't it be easier to return the reference to the first/last line? It
>    depends on whether the module implementer may want to use only parts of it.

At the moment, the middle references aren't used, but they will be needed to 
implement attribute handling (for example), to report the exact line 
where an attribute appears (in case of tags split across several lines).

>  - s/toto/tutu/i is case insensitive. I dunno whether it helps you, since I
>    didn't look closely enough to see where the tag detection is made, but
>    the 'i' char after the pattern may be what you want.
>    I thought that XML *must* be case insensitive (wrt the tag names, at
>    least).

I think XML is case sensitive, but SGML, for example, is not. There will be 
a caseinsensitive option to handle such documents, but it isn't 
implemented yet (i.e., for now everything is case sensitive).
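
As a rough sketch of what such an option could do, tag names could be 
compared through a small helper that folds case only when asked to. The 
function tag_matches and its third argument are hypothetical and only stand 
in for the planned option.

```perl
use strict;
use warnings;

# Hypothetical helper: compare a tag name found in the document with the
# one we are looking for. With $caseinsensitive set, both names are
# lowercased first (SGML-style); otherwise the comparison is exact
# (XML-style).
sub tag_matches {
    my ($tag, $wanted, $caseinsensitive) = @_;
    return $caseinsensitive
        ? lc($tag) eq lc($wanted)
        : $tag eq $wanted;
}

for my $ci (0, 1) {
    my $result = tag_matches('PARA', 'para', $ci) ? 'match' : 'no match';
    print "caseinsensitive=$ci: $result\n";
}
```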

> > I attach an adaptation of the Dia module and a sample Docbook module that 
> > both work with the Xml module. The new Dia module is fully functional (I 
> > like those really simple formats ;), but the Docbook one is only for 
> > testing. At least those are examples of how simple the derived modules can 
> > be.
> 
> If Dia2.pm is as good as Dia.pm, I guess that you should put it into the CVS
> (erasing Dia.pm), and modify the MANIFEST to get Xml distributed. Xml
> shouldn't be used directly, as you point out in the documentation, but
> that's not an issue.

I forgot that Dia diagrams use UTF-8, and Dia2 with the Xml module 
doesn't recode yet. It will have to wait.

> We may want to put the formats in another directory, maybe. So that
> po4a-translate -t po blabla fails. For now, I guess that Chooser will try to
> load Po.pm as a format...

It would be nice.

Regards,

Jordi Vilalta