[Po4a-devel] Problems with Docbook/Xml module

Nicolas François nicolas.francois at centraliens.net
Sat Nov 12 13:49:19 UTC 2005


Hello,

On Sun, Oct 23, 2005 at 04:50:55AM +0200, jvprat at gmail.com wrote:
> Hi!
> 
> 2005/10/22, Nicolas François:
> > It's more complicated than expected: The problem is not with the <footnote>
> > tag (which is already in the inline category). The problem is caused by
> > the <para> tag, which imposes the break in the footnote.
> >
> > I don't know what can be done. Don't expect it in the next release.
> 
> IIRC, the XML module is ready to handle this. You can add
> "<footnote><para>" to the inline option. It's a documented but
> currently unused feature. I hope it works! :D

This is not possible at this time.
When an inline tag is found, it is not added to the tags path.
Thus, in:
<para>
  foo
  <footnote>
    <para>
      bar
    </para>
  </footnote>
</para>

The tags path for "bar" is <para><para>, so <footnote><para> never matches.
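
To make the difference concrete, here is a small standalone model (in
Python, not po4a's Perl code). The text_paths helper, its track_inline
switch and the simplified tag regex are hypothetical names made up for
this illustration. It only shows how the path seen by "bar" changes
depending on whether inline tags are pushed onto the tags path.

```python
import re

# Very simplified tokenizer: an opening/closing tag, or a run of text.
# (Hypothetical model; po4a's real parser is far more complete.)
TOKEN = re.compile(r"<(/?)([A-Za-z][\w.-]*)[^>]*>|([^<]+)")

def text_paths(xml, inline=(), track_inline=True):
    """Return (text, path) pairs for every non-blank text chunk.

    With track_inline=False, tags listed in `inline` are never pushed
    onto the path, which models the behaviour described above.
    """
    path = []
    found = []
    for closing, name, text in TOKEN.findall(xml):
        if text:
            if text.strip():
                found.append((text.strip(), "".join(f"<{t}>" for t in path)))
        elif name in inline and not track_inline:
            continue  # inline tag ignored: neither pushed nor popped
        elif closing:
            path.pop()
        else:
            path.append(name)
    return found

DOC = """<para>
  foo
  <footnote>
    <para>
      bar
    </para>
  </footnote>
</para>"""

before = dict(text_paths(DOC, inline={"footnote"}, track_inline=False))
after = dict(text_paths(DOC, inline={"footnote"}, track_inline=True))
print(before["bar"])  # <para><para>
print(after["bar"])   # <para><footnote><para>
```

Only in the second case does the path end with <footnote><para>, so only
then can an option such as inline="<footnote><para>" match.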


I attach a patch that should fix this.


I used it for the aptitude man page. I had to remove the <arg> and
<replaceable> tags from the Docbook module [0].
Then I used the following options:
-o tags="<arg><replaceable> <term><replaceable>"
-o inline="<arg> <literal><replaceable> <para><replaceable> <filename><replaceable>"

This reduced the number of strings from 186 to 130 on the aptitude man
page (the aptitude commands and options are no longer translatable, and
some strings are grouped and provide more context).

(The choice of the above options is quite empirical and is probably only
valid for this aptitude man page. I don't plan to add these options to the
default tags in the Docbook module. This is just to show that the patch
makes it possible to do what I want.)

[0] This is IMO another issue: it is not possible to remove a tag from the
    default list.  It would be better, when a tag is provided in an
    option, to first remove it from the default categories.  For example,
    putting <arg> in the inline category (-o inline="<arg>") is not
    perfect because <arg> will also stay in the 'tags' category.
    I will submit a patch for this later.
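
    As a rough sketch of that idea (a standalone Python model, not the
    future patch; apply_option and the abridged default lists below are
    hypothetical and only for illustration):

```python
def apply_option(categories, category, tags):
    """Put `tags` into `category`, first removing them everywhere else.

    `categories` maps a category name to its list of tags. A new dict
    is returned; the input is left untouched.
    """
    cleaned = {
        name: [t for t in members if t not in tags]
        for name, members in categories.items()
    }
    cleaned.setdefault(category, []).extend(tags)
    return cleaned

# Abridged defaults, for illustration only.
defaults = {
    "tags": ["<abbrev>", "<arg>", "<replaceable>"],
    "inline": ["<footnote>"],
}

updated = apply_option(defaults, "inline", ["<arg>"])
print(updated["tags"])    # ['<abbrev>', '<replaceable>']
print(updated["inline"])  # ['<footnote>', '<arg>']
```

    With something like this, -o inline="<arg>" would no longer leave
    <arg> in the 'tags' category as well.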

Cheers,
-- 
Nekral
-------------- next part --------------
Index: lib/Locale/Po4a/Docbook.pm
===================================================================
RCS file: /cvsroot/po4a/po4a/lib/Locale/Po4a/Docbook.pm,v
retrieving revision 1.7
diff -u -r1.7 Docbook.pm
--- lib/Locale/Po4a/Docbook.pm	21 Oct 2005 19:10:29 -0000	1.7
+++ lib/Locale/Po4a/Docbook.pm	12 Nov 2005 10:12:06 -0000
@@ -91,7 +91,6 @@
 	$self->{options}{'tags'}.='
 		<abbrev>
 		<acronym>
-		<arg>
 		<artheader>
 		<attribution>
 		<date>
@@ -207,7 +206,6 @@
 		<property>
 		<quote>
 		<remark>
-		<replaceable>
 		<returnvalue>
 		<revhistory>
 		<sgmltag>
Index: lib/Locale/Po4a/Xml.pm
===================================================================
RCS file: /cvsroot/po4a/po4a/lib/Locale/Po4a/Xml.pm,v
retrieving revision 1.30
diff -u -r1.30 Xml.pm
--- lib/Locale/Po4a/Xml.pm	26 Oct 2005 13:56:18 -0000	1.30
+++ lib/Locale/Po4a/Xml.pm	12 Nov 2005 10:12:08 -0000
@@ -826,9 +826,25 @@
 sub treat_content {
 	my $self = shift;
 	my $blank="";
+	# Indicates if the paragraph will have to be translated
+	my $translate = 0;
+
 	my ($eof,@paragraph)=$self->get_string_until('<',{remove=>1});
 
+	# Check if this has to be translated
+	if ($self->join_lines(@paragraph) !~ /^\s*$/s) {
+		my $struc = $self->get_path;
+		my $inlist = 0;
+		if ($self->tag_in_list($struc,@{$self->{tags}})) {
+			$inlist = 1;
+		}
+		if ($self->{options}{'tagsonly'} eq $inlist) {
+			$translate = 1;
+		}
+	}
+
 	while (!$eof and !$self->breaking_tag) {
+	NEXT_TAG:
 		my @text;
 		my $type = $self->tag_type;
 		my $f_extract = $tag_types[$type]->{'f_extract'};
@@ -837,19 +853,67 @@
 			# Remove the content of the comments
 			($eof, @text) = $self->extract_tag($type,1);
 		} else {
+			my ($tmpeof, @tag) = $self->extract_tag($type,0);
 			# Append the found inline tag
 			($eof,@text)=$self->get_string_until('>',
 			                                     {include=>1,
 			                                      remove=>1,
 			                                      unquoted=>1});
+			# Append or remove the opening/closing tag from
+			# the tag path
+			if ($tag_types[$type]->{'end'} eq "") {
+				if ($tag_types[$type]->{'beginning'} eq "") {
+					push @path, $self->get_tag_name(@tag);
+				} elsif ($tag_types[$type]->{'beginning'} eq "/") {
+					my $test = pop @path;
+					if (!defined($test) ||
+					    $test ne $tag[0] ) {
+						die wrap_ref_mod($tag[1], "po4a::xml", dgettext("po4a", "Unexpected closing tag </%s> found. The main document may be wrong."), $tag[0]);
+					}
+				}
+			}
 			push @paragraph, @text;
 		}
 
 		# Next tag
 		($eof,@text)=$self->get_string_until('<',{remove=>1});
 		if ($#text > 0) {
+			# Check if text (extracted after the inline tag)
+			# has to be translated
+			if ($self->join_lines(@text) !~ /^\s*$/s) {
+				my $struc = $self->get_path;
+				my $inlist = 0;
+				if ($self->tag_in_list($struc,
+				                       @{$self->{tags}})) {
+					$inlist = 1;
+				}
+				if ($self->{options}{'tagsonly'} eq $inlist) {
+					$translate = 1;
+				}
+			}
 			push @paragraph, @text;
 		}
+
+		# If the next tag closes the last inline tag, we loop again
+		# (In the case of <foo><bar> being the inline tag, we can't
+		# loop back with the "while" because breaking_tag will check
+		# for <foo><bar><bar>, hence the goto)
+		$type = $self->tag_type;
+		if (    ($tag_types[$type]->{'end'} eq "")
+		    and ($tag_types[$type]->{'beginning'} eq "/") ) {
+			my ($tmpeof, @tag) = $self->extract_tag($type,0);
+			if ($self->get_tag_name(@tag) eq $path[$#path]) {
+				# The next tag closes the last inline tag.
+				# We need to temporarily remove the tag from
+				# the path before calling breaking_tag
+				my $t = pop @path;
+				if (!$tmpeof and !$self->breaking_tag) {
+					push @path, $t;
+					goto NEXT_TAG;
+				}
+				push @path, $t;
+			}
+		}
 	}
 
 	# This strips the extracted strings
@@ -899,17 +963,8 @@
 	if ( length($self->join_lines(@paragraph)) > 0 ) {
 		my $struc = $self->get_path;
 		my $options = $self->tag_in_list($struc,@{$self->{tags}});
-		my $inlist;
-		if ($options eq 0) {
-			$inlist = 0;
-			$options = "";
-		} elsif ($options eq 1) {
-			$inlist = 1;
-			$options = "";
-		} else {
-			$inlist = 1;
-		}
-		if ( $self->{options}{'tagsonly'} eq $inlist ) {
+		$options = "" if ($options eq 0 or $options eq 1);
+		if ($translate) {
 			# This tag should be translated
 			$self->pushline($self->found_string(
 				$self->join_lines(@paragraph),

