Bug#516129: perl-modules: CGI.pm unwanted UTF-8 conversion in URLs
Niko Tyni
ntyni at debian.org
Sun Feb 22 21:48:45 UTC 2009
On Thu, Feb 19, 2009 at 03:46:19PM +0100, Kiss Gabor (Bitman) wrote:
> > > Function url(-path-info=>1) does not work well if I have ISO-8859-2
> > > accented chars in the URL. Utility function CGI::Util::escape()
> > > unconditionally forces an ISO-8859-1 -> UTF-8 conversion:
> > >
> > > # force bytes while preserving backward compatibility -- dankogai
> > > $toencode = pack("C*", unpack("U0C*", $toencode));
> Unfortunately 3.38 does not work.
OK, thanks.
I must admit I'm a bit confused about the problem. Could you please
give a simple test case (either a command-line version or a CGI script)
with the current result and the one you'd expect?
As far as I can see (looking at 3.29), url(-path-info=>1) will unescape()
the PATH_INFO variable into 8-bit characters and then encode those manually
into URL encoding with sprintf() as the last thing in the url() function.
I can't see CGI::Util::escape() being called here - are you calling
that manually?
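To make that reading concrete, here is a minimal sketch of the decode-then-sprintf() round trip as I understand it. It is not the actual CGI.pm source: the variable names, the sample path and the set of characters left unencoded are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the PATH_INFO handling described above;
# the names and the "safe" character set are assumptions, not CGI.pm's code.
my $path_info = "/path%E4";

# Step 1: decode %XX sequences into single 8-bit characters.
(my $bytes = $path_info) =~ s{%([0-9A-Fa-f]{2})}{chr(hex($1))}ge;

# Step 2: re-encode everything outside the safe set, one byte at a time.
(my $encoded = $bytes) =~ s{([^A-Za-z0-9_.~/-])}{sprintf("%%%02X", ord($1))}ge;

print "$encoded\n";   # /path%E4
```

Handled byte by byte like this, the 0xE4 character comes back as %E4 with no charset conversion involved.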
I do get your point that escape(unescape()) doesn't round-trip, of course:
% perl -MCGI::Util=escape,unescape -E 'say escape(unescape("%E4"))'
%C3%A4
but it's not clear to me what this breaks, particularly as those aren't
public subroutines.
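For comparison, the following sketch reproduces the same %E4 -> %C3%A4 effect using only the core Encode module. The two helper subs are hypothetical and are not CGI::Util's implementation; the explicit ISO-8859-1 -> UTF-8 step is assumed to be equivalent to what the pack/unpack line quoted above ends up doing.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helpers for illustration only; not CGI::Util's own code.
sub pct_decode {
    my ($s) = @_;
    $s =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $s;
}

sub pct_encode_bytes {
    my ($bytes) = @_;
    $bytes =~ s/([^A-Za-z0-9_.~-])/sprintf("%%%02X", ord($1))/ge;
    return $bytes;
}

my $raw = pct_decode("%E4");    # a single byte, 0xE4

# Encoding the byte as it stands keeps the original escape.
my $as_is = pct_encode_bytes($raw);

# Treating the byte as ISO-8859-1 and converting to UTF-8 first
# turns it into two bytes, 0xC3 0xA4, before encoding.
my $via_utf8 = pct_encode_bytes(encode("UTF-8", decode("ISO-8859-1", $raw)));

print "$as_is\n";      # %E4
print "$via_utf8\n";   # %C3%A4
```

A URL whose path was originally in ISO-8859-2 would therefore come out with different bytes after such a conversion, which I assume is the effect being reported.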
Sorry if I'm being dense.
--
Niko Tyni ntyni at debian.org