Bug#282503: Documenting / closing the bug

Gunnar Wolf <gwolf@gwolf.org>, 282503@bugs.debian.org
Wed, 15 Jun 2005 10:28:30 -0500


tag 282503 + patch
thanks

As Allard noted, this bug would be quite hard to correct, as the
program's logic really does not take tags into account when stripping
whitespace. 

I took the 'documented bugs become features' approach, and properly
documented this, so it will not bite any more innocent people ;-) Here
is the diff. I have already uploaded it to unstable.

--- lib/HTML/Clean.pm   (revision 1171)
+++ lib/HTML/Clean.pm   (revision 1172)
@@ -375,6 +375,16 @@
 
 =back
 
+Please note that if your HTML includes preformatted regions (that is, if it
+includes <pre>...</pre>), we do not suggest removing whitespace, as it will
+alter the rendered results.
+
+HTML::Clean will print out a warning if it finds a preformatted region and is 
+requested to strip whitespace. In order to prevent this, specify that you don't
+want to strip whitespace - i.e.
+
+  $h->strip( {whitespace => 0} );
+
 =cut
 
 use vars qw/
@@ -435,6 +445,17 @@
   }
 
   if ($do_whitespace) {
+    if ($$h =~ /<pre/i) {
+       warn << 'EOF'
+Warning: Stripping whitespace will affect preformatted region's layout
+You have a <pre> region in your HTML, which depends on the whitespace not
+being modified. You requested to strip the whitespace - The rendered results
+will be affected.
+
+Hint: Use $h->strip({whitespace => 0}); instead.
+EOF
+    }
+
     $$h =~ s,[\r\n]+,\n,sg; # Carriage/LF -> LF
     $$h =~ s,\s+\n,\n,sg;   # empty line
     $$h =~ s,\n\s+<,\n<,sg; # space before tag
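
For anyone who really needs whitespace stripping on HTML that contains
<pre> regions, one possible workaround is to do the stripping outside of
HTML::Clean and skip the preformatted blocks. What follows is only a rough
sketch, not part of the module or of the patch above: strip_ws_outside_pre
is a made-up name, it reuses just the three substitutions visible in the
diff context, and the regex assumes well-formed, non-nested <pre> blocks.
Whitespace directly in front of an opening <pre> tag is left alone, since
the tag itself sits in the protected part.

```perl
use strict;
use warnings;

# Hypothetical helper (not in HTML::Clean): apply the whitespace
# substitutions only to the text outside <pre>...</pre> regions.
sub strip_ws_outside_pre {
    my ($html) = @_;

    # Because the pattern has a capturing group, split() also returns
    # the matched <pre> blocks, at the odd indexes of @parts.
    my @parts = split m{(<pre\b[^>]*>.*?</pre\s*>)}is, $html;

    for my $i (0 .. $#parts) {
        next if $i % 2;              # odd index: a <pre> block, keep untouched
        for ($parts[$i]) {
            s,[\r\n]+,\n,sg;         # Carriage/LF -> LF
            s,\s+\n,\n,sg;           # empty line
            s,\n\s+<,\n<,sg;         # space before tag
        }
    }
    return join '', @parts;
}

my $html = "<p>a</p>  \n\n  <pre>\n  x   y\n\n</pre>\n\n   <p>b</p>\n";
print strip_ws_outside_pre($html);
```

The result could then be handed to HTML::Clean with whitespace => 0 for
the remaining cleanups, which also keeps the new warning quiet.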


-- 
Gunnar Wolf - gwolf@gwolf.org - (+52-55)1451-2244 / 5623-0154
PGP key 1024D/8BB527AF 2001-10-23
Fingerprint: 0C79 D2D1 2C4E 9CE4 5973  F800 D80E F35A 8BB5 27AF