Perl Unicode Cookbook: Decode @ARGV as UTF-8
℞ 13: Decode program arguments as utf8
While the standard Perl Unicode preamble makes Perl’s filehandles use UTF-8 encoding by default, filehandles aren’t the only sources and sinks of data. The command-line arguments to your programs, available through @ARGV, may also need decoding.
You can have Perl handle this operation for you automatically in two ways, or you can do it yourself manually.
As documented in perldoc perlrun, the -C flag controls Unicode features. Use the A modifier for Perl to treat your arguments as UTF-8 strings:
$ perl -CA ...
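You may, of course, use -C on the shebang line of your programs. Since Perl 5.10.1, this works only when the same option also reaches Perl on the command line, as it does when the script is executed directly through its shebang line; running the script as an argument to perl without the flag fails with a "Too late" error.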
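To see what the A modifier changes, a small script can report the length of each argument. This is only a sketch: the helper name describe_arg and the file name argv_len.pl used below are made up for illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: summarize one argument as "length=N decoded=yes|no".
# utf8::is_utf8 reports whether Perl holds the string as decoded characters.
sub describe_arg {
    my ($arg) = @_;
    my $decoded = utf8::is_utf8($arg) ? 'yes' : 'no';
    return sprintf 'length=%d decoded=%s', length($arg), $decoded;
}

# Report on every command-line argument.
print describe_arg($_), "\n" for @ARGV;
```

Assuming a UTF-8 terminal that sends é as the single precomposed code point U+00E9, running perl argv_len.pl é should count two bytes, while perl -CA argv_len.pl é should count one character. Pure-ASCII arguments look the same either way.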
The second approach is to use the PERL_UNICODE environment variable. It takes the same values as the -C flag; to get the same effect as -CA, write:
$ export PERL_UNICODE=A
You may temporarily disable this automatic Unicode treatment with PERL_UNICODE=0.
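Because both -C and PERL_UNICODE are set outside the program, it can help to check from inside which setting is in effect. The read-only variable ${^UNICODE} reflects the active flags, and perlrun lists 32 as the numeric value of the A modifier. The following is a minimal sketch; the constant and helper names are hypothetical.

```perl
use strict;
use warnings;

# Bit value of the A modifier, as listed in perlrun's table for -C.
use constant UNICODE_ARGV_FLAG => 32;

# Hypothetical helper: true when a given ${^UNICODE} value includes A.
sub argv_flag_set {
    my ($flags) = @_;
    return ($flags & UNICODE_ARGV_FLAG) ? 1 : 0;
}

# ${^UNICODE} is set at startup from -C or PERL_UNICODE.
print argv_flag_set(${^UNICODE})
    ? "\@ARGV is decoded automatically\n"
    : "\@ARGV holds raw bytes\n";
```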
Finally, you may decode the contents of @ARGV yourself manually with the Encode module:
use Encode qw(decode_utf8);
# The second argument (1, the value of Encode::FB_CROAK) makes decode_utf8 die on malformed input.
@ARGV = map { decode_utf8($_, 1) } @ARGV;
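Manual decoding can fail on malformed input, and it should not run a second time if -CA or PERL_UNICODE=A has already decoded the arguments. The sketch below shows one way to handle both cases. The helper name decode_args is hypothetical, and the switch from decode_utf8 to decode with the strict 'UTF-8' encoding name is a choice made for this example, not part of the recipe above.

```perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK);

# Hypothetical helper: strictly decode a list of UTF-8 byte strings.
# Dies with a message naming the first malformed argument.
sub decode_args {
    my @bytes = @_;
    my @decoded;
    for my $i (0 .. $#bytes) {
        my $copy = $bytes[$i];    # decode() with a CHECK value may modify its input
        my $text = eval { decode('UTF-8', $copy, FB_CROAK) };
        die "Argument $i is not valid UTF-8\n" unless defined $text;
        push @decoded, $text;
    }
    return @decoded;
}

# Skip decoding when -CA or PERL_UNICODE=A has already done it (bit 32).
@ARGV = decode_args(@ARGV) unless ${^UNICODE} & 32;
```

Reporting the argument's position rather than its contents avoids printing undecodable bytes to the terminal.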
Previous: ℞ 12: Explicit encode/decode
Series Index: The Standard Preamble