[Po4a-devel] Some patches for the Man module

Mon, 27 Sep 2004 11:16:39 +0200

On Sun, Sep 26, 2004 at 06:03:54PM +0200, Nicolas François wrote:
> On Fri, Sep 24, 2004 at 01:28:23AM +0200, Martin Quinson wrote:
> > On Wed, Sep 22, 2004 at 11:19:30PM +0200, Nicolas François wrote:
>
> Alioth is up and running again. Here is my account: nekral-guest.

Done. You now have commit rights. Please use them, but do not abuse them :)
If you want, we can stick to the current model for a while, where I review
and commit your patches. But I gave you CVS commit rights so that at
least trivial changes (such as the use of the testsuite stuff) can go in
faster.

> > >   + comments
> > >     It recognizes some (probably incorrect, but usual) comment lines.
> > >     Here are the results of the regression tests for this patch:
> >
> > Question: the info groff part you cite in comment seems to imply that
> > .B toto \" a comment
> > is a valid construct, but this is not handled by the code. You should at
> > least add a comment stating so, shouldn't you?
>
> Yes, it is valid. (At least it is recognized as such by my groff).
>
> My problem was: Ok, I recognized a comment inside a paragraph, what
> should I do with this comment ?
> Can I show it in the po (and how?)? Is there any interest in doing this?
> Can I just trash the comments?

I'd say: push it to the po file. The only issue is that it shows a flaw in
po4a. Modules have no way [that I can remember] to push comments into the po
file :-/

Ok, I've added one. It's not tested yet, but if you pass "comment" =>
"your_comment" at the end of the translate arguments, it should do the
trick. Now, we should integrate this cleanly into the man module (having a
variable dedicated to the storage of the comments, and used when pushing
translations), but I don't want to interfere with your current changes in
that file.

If you test it yourself, the comments should end up on lines beginning with:
#. 

Let's keep it on the todo for now.
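
For reference, here is a minimal sketch of what the result should look like
in the po file. It does not call po4a at all: po_entry is a hypothetical
stand-in that only mimics the "comment" => ... argument, and it assumes a
single-line msgid.

```perl
use strict;
use warnings;

# Hypothetical helper, not po4a code: render one po entry, putting the
# optional comment on "#. " lines above the msgid.
sub po_entry {
    my ($msgid, %opts) = @_;
    my $out = "";
    if (defined $opts{comment}) {
        # One "#. " line per line of comment text.
        $out .= "#. $_\n" foreach split(/\n/, $opts{comment});
    }
    # Escape backslashes and double quotes, as po strings require.
    (my $escaped = $msgid) =~ s/(["\\])/\\$1/g;
    $out .= "msgid \"$escaped\"\nmsgstr \"\"\n";
    return $out;
}

print po_entry("toto", comment => "a comment");
```

Translators would then see the comment right above the string it belongs to.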

> > >   + nested_fonts
> > >     It deals with the nested font issue.
> > >     I have an idea on how to simplify it a lot, but I think it could be
> > >     applied, because it is doing a good job.
> > >     The only remaining issue is with "un-terminated" fonts, as in:
> > >       Hello, my name is \fINicolas \fBFRANÇOIS
> > >     IMHO, in groff, there is no nested font (with some exceptions, like
> > >     SB, and some italic and bold faces, or by using exotic tmac).
> > >     \fIfoo\fBbar\fR is equivalent to \fIfoo\fR\fBbar\fR (with the
> > >     exception of the \fP).
> >
> > Mmm. That's a very tough problem. As you saw in the code, I already tried
> > to tackle it, in vain so far. I must admit that the regression results you
> > show are rather impressive.
> >
> > But I do not like this patch anyway, for several reasons:

If you don't mind, I'll let it mature on your side. I have so little time
myself...

> 1) change all font modifiers (e.g. .B, .RI,...) to the corresponding \f
>    I'm thinking of doing this in shiftline (any objection ?), because I
>    need to handle these lines in parse, in the .TP, .SH, and maybe other
>    macro subroutines.

Overriding shiftline could be a good idea. You may want to handle some parts
of the chaos in an "upper layer". But please make sure to document this.

> Then in pre_trans:
> 2) split the paragraph on \f
> 3) play with the first letters of each element of this array
>    (they can be [1-9],B,I,R,CW,P,s(-|+)[0-9])
>    - If the first letter is not recognized, then die (or give a \f to the
>      translator)
>    - If two consecutive elements start with the same font, merge them
>
>    Special care must be taken for the first elements (maybe the first two,
>    because of \fP) and for the last (maybe the last two) elements.
>    I will probably do this with a global context that will keep the
>    current and previous font (which will be reset by some macros like
>    .SH).
>
> I will do some experiments on this.
> I think this will be better, cleaner, smaller.

Sounds very good to me.
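
To make sure we are talking about the same thing, here is a rough sketch of
steps 2 and 3 as I understand them. It is only an illustration: merge_fonts
is a hypothetical name, only one-character fonts ([1-9], B, I, R, P) are
recognized (CW and the size changes are left out), and \fP is never merged
because its meaning depends on the previous font.

```perl
use strict;
use warnings;

# Hypothetical helper: split a paragraph on \f, then merge consecutive
# chunks that select the same font.
sub merge_fonts {
    my ($para) = @_;
    my @parts = split(/\\f/, $para, -1);
    my $first = shift @parts;
    $first = "" unless defined $first;
    # Each chunk is [font, text]; the text before the first \f has no font.
    my @chunks = (["", $first]);
    foreach my $part (@parts) {
        $part =~ /^([1-9BIRP])(.*)\z/s
            or die "unrecognized font in '\\f$part'\n";
        my ($font, $text) = ($1, $2);
        if ($font ne "P" && $font eq $chunks[-1][0]) {
            $chunks[-1][1] .= $text;      # same font as before: merge
        } else {
            push @chunks, [$font, $text];
        }
    }
    # Rebuild the paragraph from the merged chunks.
    my $out = "";
    foreach my $chunk (@chunks) {
        my ($font, $text) = @$chunk;
        $out .= "\\f$font" if $font ne "";
        $out .= $text;
    }
    return $out;
}

print merge_fonts('\fIfoo\fIbar\fR'), "\n";
```

The global context you mention (current and previous font) would replace the
simple "P is never merged" rule used here.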

> >  - there is one stylistic rule I like in Perl: if you use a given variable
> >    only once, remove it, you don't need it. @tmp_array1 falls in that
> >    category. Likewise, I'd prefer to write (! $old eq "") as ($old ne "")
> >    [please don't get offended by those stylistic remarks, this code is
> >     incredibly hairy, I'd like to keep it under control]
>
> I didn't know ne ;) thanks!
>
> How can I remove a variable (or maybe you mean I could have done it
> without using variable with map)?

Simply change something like
 @toto = split(/\n/, $toto);
 foreach (@toto) {}
to
 foreach (split(/\n/, $toto)) {}
Since @toto is used only once besides its initialization, it's useless. Of
course, sometimes you want to keep useless variables to make the code more
readable. I didn't feel it was the case here, but express yourself ;)
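
A small self-contained illustration of both remarks, with made-up values.
Note that the ne form is not only nicer to read: "!" binds tighter than "eq",
so (! $old eq "") is parsed as ((!$old) eq ""), and the two forms disagree
when $old is "0".

```perl
use strict;
use warnings;

my $toto = "first\nsecond";

# No temporary array: iterate directly over the list returned by split.
my $count = 0;
foreach my $line (split(/\n/, $toto)) {
    $count++;
}

# Precedence: (! $old eq "") means ((!$old) eq "").
my $old = "0";
my $negated = (! $old eq "") ? 1 : 0;   # 0, since !"0" is 1 and 1 eq "" is false
my $direct  = ($old ne "")   ? 1 : 0;   # 1, since "0" is not the empty string

print "lines=$count negated=$negated direct=$direct\n";
```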

> >    We should use the x modifier more often to document the REs.
>
> Right!
> I'm not sure I can still read my regexp now!

I will try to do so more often, too.

> > All that to say that I'd prefer to keep this one out of the CVS for now.
> >
> > >   + arg_next_line
> > >     It allows arguments to be provided on the next line for some macros
> > >     (.SH, .I, ..., .BR, ...)
> > >
> > >     It works fine, but would require some cleanup (lots of redundant
> > >     code).
> > Erm. You'll say I'm picky, but I'd prefer you to do this cleanup before
> > commit :)
> I never said the patches were ready for commit;)

Ok, let's let this one mature, too.

> > >   + dot_lines
[...]
> > Why don't you change it in paragraphs after the wrap? You'd get 'em all,
> > and the code for that would be in only one location.
> >
> > For example, you may want to merge your change to the
> >       $str =~ s/\n([.'"])/ $1/mg; #'
> > just before. What about:
> >       $str =~ s/^          # at the begin of line
> >                  (?:\\f.)? # possibly followed by a font modifier
> > 		 ([.'"])   # avoid '.' and '''
> > 	       /\\&$1$2    # add \& in front to fix it
> > 	       /mgx;
>
> You're right, doing it in post_trans is better.
> However, there's currently a minor bug with .TP:
[...]
>=20
> So I only propose a degraded mode:
[..]
> This seems to work fine. I will prepare a patch or commit it.

Please do so, this one looks good too.

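By the way, the substitution I suggested was typed from memory and would not
run as written: the font group was not capturing, so $2 was never set, and
the /x modifier only applies to the pattern, so a comment in the replacement
part ends up in the output. A runnable variant, as a sketch only (the
function name is made up):

```perl
use strict;
use warnings;

# Hypothetical helper: put \& in front of lines that would otherwise
# start with a control character, possibly after a font escape.
sub protect_control_chars {
    my ($str) = @_;
    $str =~ s/^             # at the beginning of each line
              ((?:\\f.)?)   # optional font escape, captured to put it back
              ([.'"])       # a character that must not start the line
             /\\&$1$2/mgx;
    return $str;
}

print protect_control_chars(".foo\nbar\n\\fB.baz"), "\n";
```

Only the lines starting with one of the listed characters are touched; the
others go through unchanged.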
> > >   + new_macros
>
> It should be attached this time.
> Do you think the .so/.mso part is OK ?

Nope, it's not. But you had no way to know it. Btw, I committed all the
others.

For the sgml module, the policy is to include all the translations in only
one po file, no matter how many source files there are. This is because it's
very difficult to parse an sgml sub-file alone. Where it's included in the
main document matters, as do the entities defined in the prolog of
the main document, and so on.

So, I'd like to follow the same policy for the man pages. I know that it
will result in duplicates in the translations, but, well, translators can
use compendiums.

The preferred handling of .so is thus to read the included file, and then
unshift all its lines (beginning from the end, of course) into the
TransTractor. Of course, this should be a function of the TransTractor
itself, such as includefile($). But Jordi and I had not agreed on the
function prototype and syntax before my vacation...
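
To illustrate the idea without committing to any prototype, here is a sketch
where a plain array stands in for the TransTractor input. The name
unshift_file_lines is made up, and the function takes an already opened
filehandle only to keep the example self-contained.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the TransTractor input: a plain array of lines.
sub unshift_file_lines {
    my ($queue, $fh) = @_;
    my @lines = <$fh>;
    # Start from the end so that the first line of the file ends up first.
    unshift @$queue, $_ foreach reverse @lines;
    return scalar @lines;
}

# Usage: an in-memory file plays the role of the file named by .so
my @queue = (".SH AFTER\n");
my $content = "first\nsecond\n";
open(my $fh, '<', \$content) or die "cannot open in-memory file: $!";
unshift_file_lines(\@queue, $fh);
close($fh);
print @queue;
```

The included lines are then parsed before whatever followed the .so request,
as if they had been part of the page.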

> > >   + escape
> > >     It tries to deal with the \c escape.
> > >     It still needs some work.
>
> I've done some tests. Overriding pushline is OK. I will try to handle \c
> there.

Go ahead, dude.

> > >   + others
> > >     some other minor points that I could isolate from my working directory
> >
> > What is the change for cdrdao? It deals with non-escaped leading spaces,
> > am I right? Does it still deal with non-leading spaces? I don't think so.
> > You may want to kill those ^
>
> cdrdao uses:
> .IP CATALOG\ "ddddddddddddd"
> (Here, the quotes have to be displayed)
> When this passes through pushmacro, additional quotes shouldn't be added,
> because
> .IP "CATALOG\ "ddddddddddddd""
> (this is understood by groff as an IP request with 2 arguments: 'CATALOG\ '
> and 'ddddddddddddd""')

Committed with this explanation.

> The regexp proposed in the patch would be shorter with a negative
> look-behind regexp.

Feel free to shorten it, if it stays readable (or use /x, or both ;).
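
Just to show the kind of look-behind I have in mind (this is not the regexp
from the patch, and has_unescaped_space is a made-up name): the test below
is true only for a space that is not preceded by a backslash. It is a sketch
only; an escaped backslash followed by a space would be misread.

```perl
use strict;
use warnings;

# Hypothetical helper: does this macro argument contain a space that is
# not escaped with a backslash?
sub has_unescaped_space {
    my ($arg) = @_;
    return $arg =~ /(?<!\\)[ ]/ ? 1 : 0;
}

print has_unescaped_space('CATALOG\ "ddddddddddddd"'), "\n";
print has_unescaped_space('two words'), "\n";
```

With such a test, the cdrdao argument would be left alone while an argument
with a real space would still get its quotes.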

> > Ok for the SH macro.
>
> Do you think it could be done in the $macro{'SH'} sub ?

I dunno. Now, it's committed as you said, and I dunno which would be best:
keeping all the $wrapmode handling at the same location, or all the .SH
actions at the same place. It's kinda the same.

> > For the space at the beginning of paragraph, shouldn't it be handled along
> > with ' or . at the beginning of line?

[again ;]

That's the bounced part:

> @@ -986,9 +987,13 @@ $macro{'ps'}=\&untranslated;
>  # .so filename Include source file.
>  # .mso groff variant of .so (other search path)
>  $macro{'so'}= $macro{'mso'} = sub {
> -    die "po4a::man: ".sprintf(
> -      dgettext("po4a","This page includes another file with '%s'. This is not supported yet, but will soon."),
> +    print STDERR "po4a::man: ".sprintf(
> +      dgettext("po4a","This page includes another file with '%s'. Don't forget to translate this file, and to make it available at the right place."),
>  	$_[1])."\n",;
> +    my ($self,$macroname,$macroarg)=(shift,shift,join(" ",@_));
> +
> +    $self->pushmacro($macroname,
> +		     $self->t($macroarg));
>  };
>  # .sp     Skip one line vertically.
>  # .sp N   Space  vertical distance N

Thanks so much for your work on po4a,
Mt.
