[Po4a-devel] Fuzzing results

intrigeri intrigeri at boum.org
Thu Nov 13 17:24:00 UTC 2008


Hi,

Nicolas François wrote (12 Nov 2008 20:10:39 GMT) :
> If you can produce an infinite loop, can you ask zzuf to reproduce
> this test vector?

Yes. zzuf's behaviour is deterministic, so this should be reproducible
anywhere, at least using zzuf 0.12-1.
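
A quick way to convince oneself of this determinism is to run the same
fuzzed command twice and compare checksums of the output. This is only a
sketch: is_deterministic is a hypothetical helper name, and the seed,
ratio and input file in the usage comment are just examples.

```shell
# is_deterministic CMD...: run CMD twice and succeed only if both runs
# produce the same combined stdout/stderr (compared via cksum).
is_deterministic() {
    first=$("$@" 2>&1 | cksum)
    second=$("$@" 2>&1 | cksum)
    [ "$first" = "$second" ]
}

# Example usage (assumes zzuf is installed and GPL-3 is in the current
# directory):
#   is_deterministic zzuf -s 13 -r 0.1 cat GPL-3 && echo "same output"
```

Identical checksums on two runs do not prove the behaviour is the same
on another machine or zzuf version, they only rule out obvious
run-to-run variation.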

In order to isolate reproducible cases, I re-did my tests against the
updated po4a CVS, using some more broadly available input files.

BTW, I had to fix the version in TransTractor.pm to be able to build
a Debian package from the CVS source (though I would understand if you
did not want to change the version number to 0.35 until it is
released).

>> ,----
>> |  po4a-gettextize
>> `----
>> 
>> Without specifying the input charset, zzuf'ed po4a-gettextize quickly
>> errors out, complaining it was not able to detect the input charset;
>> no incomplete file is left on disk.
>> 
>> So I had to pretend the input was in UTF-8, as does ikiwiki's po plugin.
>> 
>> Two ways of crashing were revealed by this command-line:
>> 
>>         zzuf -vc -s 0:100 -r 0.1:0.5 \
>>           po4a-gettextize -f text -o markdown -M utf-8 -L utf-8 \
>>             -m LICENSES >/dev/null
>> 
>> They are:
>> 
>> Malformed UTF-8 character (UTF-16 surrogate 0xdcc9) in substitution iterator at /usr/share/perl5/Locale/Po4a/Po.pm line 1443.
>> Malformed UTF-8 character (fatal) at /usr/share/perl5/Locale/Po4a/Po.pm line 1443.
>> 
>> and
>> 
>> Malformed UTF-8 character (UTF-16 surrogate 0xdcec) in substitution (s///) at /usr/share/perl5/Locale/Po4a/Po.pm line 1443.
>> Malformed UTF-8 character (fatal) at /usr/share/perl5/Locale/Po4a/Po.pm line 1443.
>> 
>> Perl seems to exit cleanly, and an incomplete PO file is written on
>> disk. I am not sure if this is a bug in Perl or in Po.pm.

> I'm not sure I can catch this one. A proper error message indicating the
> line number in the input document would be preferable.

Reproducible test case:

  zzuf -c -s 13 -r 0.1 \
    po4a-gettextize -f text -o markdown -M utf-8 -L utf-8 \
     -m GPL-3 -p GPL-3.pot

Crashes with:

  Malformed UTF-8 character (UTF-16 surrogate 0xdfa4) in substitution iterator at /usr/share/perl5/Locale/Po4a/Po.pm line 1449.
  Malformed UTF-8 character (fatal) at /usr/share/perl5/Locale/Po4a/Po.pm line 1449.

An incomplete pot file is left on disk. Unfortunately Po.pm tells us
nothing about the place where the crash happens.
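
To find which seeds in a range trigger this crash, something like the
following sketch can be used. It runs the command once per seed and
prints the seeds whose output contains a given message. scan_seeds and
fuzz_gettextize are hypothetical helper names; the zzuf command line is
the one from this mail, and matching on the error text (rather than on
the exit status) is a choice made here because I have not checked what
status zzuf returns in this case.

```shell
# scan_seeds FIRST LAST PATTERN CMD...: run CMD with each seed appended
# as its last argument, and print every seed whose combined
# stdout/stderr contains PATTERN.
scan_seeds() {
    first=$1
    last=$2
    pattern=$3
    shift 3
    seed=$first
    while [ "$seed" -le "$last" ]; do
        if "$@" "$seed" 2>&1 | grep -q -- "$pattern"; then
            echo "$seed"
        fi
        seed=$((seed + 1))
    done
}

# Hypothetical wrapper: the reproducible test case above, with the seed
# taken from the first argument.
fuzz_gettextize() {
    zzuf -c -s "$1" -r 0.1 \
        po4a-gettextize -f text -o markdown -M utf-8 -L utf-8 \
        -m GPL-3 -p GPL-3.pot
}

# Example usage (assumes zzuf, po4a and the GPL-3 file are available):
#   scan_seeds 0 100 'Malformed UTF-8' fuzz_gettextize
```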

>> ,----
>> |  po4a-translate
>> `----
>> 
>> Without specifying an input charset, same behaviour as
>> po4a-gettextize, so let's specify UTF-8 as input charset for now.
>> 
>> The command:
>> 
>>         zzuf -cv \
>>           po4a-translate -d -f text -o markdown -M utf-8 -L utf-8 \
>>             -k 0 -m LICENSES -p LICENSES.fr.po -l test.fr
>> 
>> ... prints tons of occurrences of the following error, but a complete
>> translated document is written (obviously with some weird chars
>> inside):

> I fixed these ones.

Confirmed.

>> While:
>> 
>>         zzuf -cv -s 0:10 -r 0.001:0.3 \
>>           po4a-translate -d -f text -o markdown -M utf-8 -L utf-8 \
>>             -k 0 -m LICENSES -p LICENSES.fr.po -l test.fr
>> 
>> ... seems to lose the fight, at the readpo(LICENSES.fr.po) step,
>> against some kind of infinite loop, deadlock, or any similar beast.
>> Seems like it could go on using CPU power forever, but memory use does
>> not increase.
>> 
>> Whatever format module is used does not change anything. This is thus
>> probably a bug in po4a's core or in a lib it depends on.

> This looks better now, but po4a reports that errors were found in the PO
> file (not really surprising).

> The current po4a behavior is to continue, but report in case of errors.
> I could also count the number of errors and die after a certain amount.

> I did not experience infinite loops or deadlocks

Ok, I managed to find a reproducible test that should be run at the
root of the current po4a CVS source:

  zzuf -I po/pod/ po4a po/pod.cfg

I let it run (using one of my two CPU cores) for a while, then lost
patience and concluded it was really deadlocked or stuck in an
infinite loop.

The last error message printed was (sorry for the French, but I guess
you will understand it; "Ligne étrange" means "Strange line"):

po/pod/po4a-pod.pot:1340: (po4a::po)
               Ligne étrange : -->"  8\n&<--

But this test seems to use the pod and man modules, so it doesn't
demonstrate any issue in the core.

So I wrote the following simple Perl script:

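#!/usr/bin/perl
# Minimal reproducer: feed a PO file and a master document straight to
# a po4a format module, bypassing the po4a front-end.
use warnings;
use strict;
use Locale::Po4a::Chooser;
use Locale::Po4a::Po;
my (@pos, @masters);
push @pos, "po/pod/fr.po";        # PO file, matched by zzuf -I po/pod/
push @masters, "po4a-updatepo";   # master document to translate
# Use the 'text' format module.
my $doc = Locale::Po4a::Chooser::new('text');
$doc->process(
    'po_in_name'       => \@pos,
    'file_in_name'     => \@masters,
    'file_in_charset'  => 'utf-8',
    'file_out_charset' => 'utf-8',
);
$doc->write("/tmp/test");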

Running "zzuf -I po/pod/ this_script.pl" at the root of the CVS source
reproduces this deadlock / infinite loop behavior after processing
po/pod/fr.po:1744.

I tried the same with other format modules and got the same result, so
this seems to be the readpo() issue I detected previously. The good
news is: we can now reproduce it easily.
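
Since a hung run never returns on its own, it may help to bound it
with a time limit when re-testing. The sketch below assumes the
timeout command from GNU coreutils is available (it exits with status
124 when the limit is reached); classify_run is a hypothetical helper
name, and the limit in the usage comment is arbitrary.

```shell
# classify_run LIMIT CMD...: run CMD for at most LIMIT seconds and
# report how it ended.
classify_run() {
    limit=$1
    shift
    timeout "$limit" "$@" >/dev/null 2>&1
    status=$?
    if [ "$status" -eq 124 ]; then
        echo "timeout"            # still running when the limit expired
    elif [ "$status" -eq 0 ]; then
        echo "ok"
    else
        echo "failed ($status)"
    fi
}

# Example usage, from the root of the po4a CVS source (assumes zzuf is
# installed and the script above is saved as this_script.pl):
#   classify_run 300 zzuf -I po/pod/ perl this_script.pl
```

A "timeout" result only shows that the run exceeded the limit, not
that it would never finish, so the limit should be well above the
normal run time.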

Bye,
--
  intrigeri <intrigeri at boum.org>
  | gnupg key @ https://gaffer.ptitcanardnoir.org/intrigeri/intrigeri.asc
  | So what?


