[DRE-maint] Bug#534721: libhpricot-ruby1.8: Hpricot's XML parser fails to parse simple, valid XML

Gunnar Wolf gwolf at gwolf.org
Mon Jun 29 17:57:24 UTC 2009


Ryan Niebur wrote [Sun, Jun 28, 2009 at 01:54:09AM -0700]:
> > Package: libhpricot-ruby1.8
> > Version: 0.8-2
> > Severity: grave
> > Justification: renders package unusable
> 
> Sorry about this. I have sent an email to the upstream author asking
> for input. However all of my scripts (tho I only use hpricot for
> simple HTML parsing) are still in working condition afaict. out of
> curiosity, Gunnar, has the new version of hpricot negatively affected
> dh-make-drupal at all?

Umh, I have noticed no breakage here, and I am using 0.8-2. In fact,
many of the cases presented by the submitter behave differently here:

The first problem, where an extra </zzzz> appeared, does not happen here:

$ ruby -e "require 'hpricot'; print Hpricot.XML('<aaaa></aaaa>')"
<aaaa></aaaa>
$ ruby -e "require 'hpricot'; print Hpricot.XML('<zzzz></zzzz>')"
<zzzz></zzzz>

Handling of malformed XML behaves as reported, though I don't know
whether misparsing what cannot be parsed should be considered a bug:

$ ruby -e "require 'hpricot'; print Hpricot.XML('<a></b>')"
<a></b></a>
$ ruby -e "require 'hpricot'; print Hpricot.XML('<a>b')"
<a>b</a>
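
To separate "Hpricot misparses good input" from "the input was never
parseable", one option is to run the same strings through a strict
parser first. The following is only a sketch: it uses REXML (shipped
with Ruby) rather than Hpricot, and well_formed? is a hypothetical
helper name, not part of either library.

```ruby
require 'rexml/document'

# Hypothetical helper: true when REXML, a strict XML parser, accepts
# the string; false when it raises a parse error.
def well_formed?(xml)
  REXML::Document.new(xml)
  true
rescue REXML::ParseException
  false
end

# Inputs taken from the cases discussed in this thread.
['<zzzz></zzzz>', '<a><zzzz></zzzz><b></b></a>', '<a></b>'].each do |xml|
  verdict = well_formed?(xml) ? 'well-formed' : 'not well-formed'
  puts "#{xml}: #{verdict}"
end
```

Inputs that a strict parser rejects, such as the mismatched closing
tag above, are the ones where Hpricot's output is a best guess.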

This appears to work fine:

$ ruby -e "require 'hpricot'; print Hpricot.XML('<zzzz></zzzz>').search('/zzzz')"
<zzzz></zzzz>
$ ruby -e "require 'hpricot'; print Hpricot.XML('<zzzz></zzzz>').search('/zzzz/zzzz')"

The nesting is not broken here:

$ ruby -e "require 'hpricot'; print Hpricot.XML('<a><zzzz></zzzz><b></b></a>')"
<a><zzzz></zzzz><b></b></a>
$ ruby -e "require 'hpricot'; print Hpricot.XML('<a><zzzz></zzzz><b></b></a>').search('/a/b')"
<b></b>
$ ruby -e "require 'hpricot'; print Hpricot.XML('<a><zzzz></zzzz><b></b></a>').search('/a/zzzz/b')"

This last case behaves as reported:

$ ruby -e "require 'hpricot'; print Hpricot.XML('<a></b>')"
<a></b></a>

But, again, if the input is not well-formed, I cannot expect the
output to make sense.
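
For anyone who wants to re-run the round-trip checks from this message
in one go, here is a rough sketch. CASES and round_trip_report are
hypothetical names made up for illustration; the parser is passed in
as a callable so the same table can be run against Hpricot.XML or any
other parser, and the script itself does not require hpricot.

```ruby
# Inputs taken from the one-liners in this message.
CASES = [
  '<aaaa></aaaa>',
  '<zzzz></zzzz>',
  '<a><zzzz></zzzz><b></b></a>',
  '<a></b>',
  '<a>b'
].freeze

# Hypothetical helper: feeds each input to `parser` (any callable that
# takes an XML string and returns something responding to to_s) and
# records whether the serialized output equals the input.
def round_trip_report(parser, cases = CASES)
  cases.map do |xml|
    out = parser.call(xml).to_s
    { input: xml, output: out, unchanged: out == xml }
  end
end

# Assumed usage with hpricot installed:
#   require 'hpricot'
#   round_trip_report(->(xml) { Hpricot.XML(xml) }).each do |row|
#     puts "#{row[:input]} -> #{row[:output]}"
#   end
```

Rows where unchanged is false are the ones worth comparing against the
submitter's results.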

-- 
Gunnar Wolf • gwolf at gwolf.org • (+52-55)5623-0154 / 1451-2244





