Bug#516129: perl-modules: CGI.pm unwanted UTF-8 conversion in URLs

Niko Tyni ntyni at debian.org
Sun Feb 22 21:48:45 UTC 2009


On Thu, Feb 19, 2009 at 03:46:19PM +0100, Kiss Gabor (Bitman) wrote:
 
> > > Function url(-path-info=>1) does not work well if I have ISO-8859-2
> > > accented chars in the URL. Utility function CGI::Util::escape()
> > > unconditionally forces an ISO-8859-1 -> UTF-8 conversion:
> > > 
> > >   # force bytes while preserving backward compatibility -- dankogai
> > >   $toencode = pack("C*", unpack("U0C*", $toencode));

> Unfortunately 3.38 does not work.

OK, thanks.

I must admit I'm a bit confused about the problem. Could you please
give a simple test case (either a command-line version or a CGI script)
with the current result and the one you'd expect?

As far as I can see (looking at 3.29), url(-path-info=>1) will unescape()
the PATH_INFO variable into 8-bit characters and then encode those manually
into URL encoding with sprintf() as the last thing in the url() function.
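
To illustrate that flow, here is a minimal sketch of an unescape step
followed by a byte-wise sprintf() re-encode. It is not the actual CGI.pm
code: the helper names decode_percent and encode_percent are made up for
this example, and the set of characters left unescaped is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: turn %XX sequences back into single 8-bit characters.
sub decode_percent {
    my ($s) = @_;
    $s =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $s;
}

# Hypothetical helper: percent-encode every character outside an assumed
# "safe" set, one character at a time, using sprintf().
sub encode_percent {
    my ($s) = @_;
    $s =~ s/([^A-Za-z0-9_.~\/-])/sprintf("%%%02X", ord($1))/ge;
    return $s;
}

my $path_info = "/caf%E9";    # example PATH_INFO with one 8-bit character
my $roundtrip = encode_percent(decode_percent($path_info));
print "$roundtrip\n";         # prints /caf%E9
```

With this approach an 8-bit byte comes back out as the same single %XX
sequence, so no charset conversion takes place along the way.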

I can't see CGI::Util::escape() being called here - are you calling
that manually?

I do get your point about the lack of idempotency, of course:

% perl -MCGI::Util=escape,unescape -E 'say escape(unescape("%E4"))'  
%C3%A4

but it's not clear to me what this breaks, particularly as those aren't
public subroutines.
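
For what it's worth, the result above is what you would get if the byte
0xE4 were treated as a Latin-1 character and then encoded as UTF-8 before
escaping. A minimal sketch of that, assuming the conversion described in
the report; escape_as_utf8 is a made-up stand-in that uses utf8::encode()
in place of the quoted pack/unpack line, and it is not CGI::Util::escape()
itself:

```perl
use strict;
use warnings;

# Hypothetical stand-in for the escaping step under discussion.
sub escape_as_utf8 {
    my ($s) = @_;
    # Treat each character as a code point and replace the string
    # with its UTF-8 octets (0xE4 becomes 0xC3 0xA4).
    utf8::encode($s);
    # Percent-encode every octet outside an assumed "safe" set.
    $s =~ s/([^A-Za-z0-9_.~-])/sprintf("%%%02X", ord($1))/ge;
    return $s;
}

my $escaped = escape_as_utf8(chr(0xE4));
print "$escaped\n";    # prints %C3%A4
```

So one input byte turns into two escaped bytes, which is why the
round trip does not return the original %E4.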

Sorry if I'm being dense.
-- 
Niko Tyni   ntyni at debian.org





