Bug#619214: perldoc -f length wrong about characters

Russ Allbery rra at debian.org
Tue Mar 22 01:59:14 UTC 2011


jidanni at jidanni.org writes:

> It says

>                Note the characters: if the EXPR is in Unicode, you will
>                get the number of characters, not the number of bytes.

> But I prove it wrong below.

> $ perl -wle 'print length "網路;"'
> 9
> $ perl -wle 'print length "網路"'
> 6

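Those aren't Unicode strings.  Without "use utf8", Perl reads string
literals in the source as strings of bytes, so length counts bytes, which
is consistent with the documentation.  Each of these characters takes
three bytes in UTF-8, which is why the two-character string reports 6.
If you want to put a literal Unicode string into Perl source, you have to
add "use utf8" to change the default interpretation, after which you get
the behavior that you're expecting.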

windlord:~> perl -wle 'use utf8; print length "網路;"'
3
windlord:~> perl -wle 'use utf8; print length "網路"'
2

-- 
Russ Allbery (rra at debian.org)               <http://www.eyrie.org/~eagle/>
