Perl Unicode Cookbook: Unicode Named Character Sequences
℞ 9: Unicode named sequences
Unicode includes the feature of named character sequences, which combine multiple Unicode characters behind a single name. The charnames pragma allows the use of these named sequences in literals, just as it allows the use of Unicode named characters in literals.
In Perl, these named character sequences look just like character names but return multiple code points. Notice the %v vector flag of printf, which formats the code point of each character in the string and joins the results with dots:
use charnames qw(:full);
my $seq = "\N{LATIN CAPITAL LETTER A WITH MACRON AND GRAVE}";
printf "U+%v04X\n", $seq;
U+0100.0300
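When the sequence name is only known at run time, \N{...} cannot be used because it is resolved at compile time. The following is a minimal sketch, assuming Perl 5.14 or later, that uses charnames::string_vianame for the same lookup; the variable names are illustrative only.

```perl
use 5.014;
use strict;
use warnings;
use charnames qw(:full);

# The name is held in a variable, so it must be resolved at run time.
my $name = "LATIN CAPITAL LETTER A WITH MACRON AND GRAVE";

# string_vianame returns the whole sequence as a string,
# or undef if the name is unknown.
my $seq = charnames::string_vianame($name);
die "Unknown name: $name\n" unless defined $seq;

# length() counts characters (code points), not bytes.
printf "%d code points: U+%v04X\n", length($seq), $seq;
```

string_vianame is used here rather than charnames::vianame because vianame deals in a single code point, whereas a named sequence is a multi-character string.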
Each version of Unicode may update the official list of named sequences; the latest version of the Unicode Named Sequences data file (NamedSequences.txt) is always available from the Unicode Consortium. Perl 5.14 supports Unicode 6.0, and Perl 5.16 will support Unicode 6.1.
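Because the list of named sequences depends on the Unicode version bundled with your perl, it can help to inspect what the running interpreter actually knows. This is a rough sketch, assuming the core Unicode::UCD module and its namedseq function are available; the reported version and count will vary between Perl releases.

```perl
use strict;
use warnings;
use Unicode::UCD qw(namedseq);

# Version of the Unicode Character Database shipped with this perl.
printf "Unicode version: %s\n", Unicode::UCD::UnicodeVersion();

# With no arguments in list context, namedseq returns a hash
# mapping each sequence name to its string.
my %sequences = namedseq();
printf "%d named sequences known\n", scalar keys %sequences;

# With one argument in list context, it returns the code points.
my @codepoints = namedseq("LATIN CAPITAL LETTER A WITH MACRON AND GRAVE");
print join(" ", map { sprintf "U+%04X", $_ } @codepoints), "\n";
```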
Previous: ℞ 8: Unicode Named Characters
Series Index: The Standard Preamble