Perl Unicode Cookbook: Decode @ARGV as UTF-8

℞ 13: Decode program arguments as utf8

While the standard Perl Unicode preamble makes Perl’s filehandles use UTF-8 encoding by default, filehandles aren’t the only sources and sinks of data. The command-line arguments to your programs, available through @ARGV, may also need decoding.

Perl can handle this decoding for you automatically in two ways, or you can do it yourself. As documented in perldoc perlrun, the -C flag controls Unicode features. Use the A modifier to have Perl treat the elements of @ARGV as UTF-8 encoded strings:

     $ perl -CA ...

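You may also put -C on the shebang line of your programs. Note, however, that perldoc perlrun says that since Perl 5.10.1 a -C option on the #! line must be given on the command line as well.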
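To check what the A modifier changes, a program can inspect the read-only variable ${^UNICODE}, which perlrun documents as holding the numeric value of the -C setting; the A modifier corresponds to the value 32. The following is a minimal sketch, not part of the original recipe, and the variable name $argv_decoded is only illustrative:

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# The A modifier corresponds to the value 32 in the -C bit mask (see perlrun).
# ${^UNICODE} holds the mask that was in effect when perl started.
my $argv_decoded = (${^UNICODE} & 32) ? 1 : 0;

print "A modifier in effect: ", ($argv_decoded ? "yes" : "no"), "\n";

for my $i (0 .. $#ARGV) {
    # length counts characters for decoded strings and bytes otherwise.
    printf "argument %d has length %d\n", $i, length $ARGV[$i];
}
```

Running this sketch with and without -CA on an argument that contains non-ASCII characters should report different lengths, because the undecoded argument is counted in bytes.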

The second approach is to use the PERL_UNICODE environment variable. It takes the same values as the -C flag; to get the same effect as -CA, write:

     $ export PERL_UNICODE=A

You may temporarily disable this automatic Unicode treatment with PERL_UNICODE=0.
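The effect of the environment variable can be observed from a parent Perl program that starts a child perl under different settings. This is only a sketch under assumptions: the helper name child_arg_length is hypothetical, and the example assumes a platform where list-form pipe open is available.

```perl
use strict;
use warnings;

# Hypothetical helper: start a child perl with PERL_UNICODE set to $setting
# and return the length that the child reports for its first argument.
sub child_arg_length {
    my ($setting, $arg) = @_;

    # local restores the previous value when this sub returns.
    local $ENV{PERL_UNICODE} = $setting;

    # List-form pipe open bypasses the shell, so the bytes arrive unchanged.
    open(my $child, '-|', $^X, '-e', 'print length $ARGV[0]', $arg)
        or die "Cannot start $^X: $!";
    my $length = <$child>;
    close($child) or die "Child perl exited abnormally: $?";
    return $length;
}

# "café" as UTF-8 bytes: the é is the two bytes 0xC3 0xA9.
my $utf8_bytes = "caf\xC3\xA9";

my $decoded = child_arg_length('A', $utf8_bytes);   # counted in characters
my $raw     = child_arg_length('0', $utf8_bytes);   # counted in bytes

print "PERL_UNICODE=A: length $decoded\n";
print "PERL_UNICODE=0: length $raw\n";
```

With PERL_UNICODE=A the child sees four characters; with PERL_UNICODE=0 it sees the five raw bytes.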

Finally, you may decode the contents of @ARGV yourself with the Encode module:

    use Encode qw(decode_utf8);
    # The second argument, 1 (Encode::FB_CROAK), makes malformed input fatal.
    @ARGV = map { decode_utf8($_, 1) } @ARGV;
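If you would rather report which argument was malformed than die with Encode's own message, one possible variation wraps each decode in eval. This is a sketch, not the recipe's method: decode_args is a hypothetical helper, and it swaps decode_utf8 for decode with the strict "UTF-8" encoding name.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper: decode a list of byte strings as strict UTF-8,
# naming the position of the offending argument when one is malformed.
sub decode_args {
    my @raw = @_;
    my @decoded;
    for my $i (0 .. $#raw) {
        # FB_CROAK makes decode die on malformed input;
        # LEAVE_SRC keeps decode from modifying the source string.
        my $chars = eval {
            Encode::decode('UTF-8', $raw[$i],
                Encode::FB_CROAK() | Encode::LEAVE_SRC());
        };
        die "Argument $i is not valid UTF-8: $@" unless defined $chars;
        push @decoded, $chars;
    }
    return @decoded;
}

@ARGV = decode_args(@ARGV);
```

Do not combine this with -CA or PERL_UNICODE=A, since the arguments would then already be decoded when decode_args sees them.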

