Perl Unicode Cookbook: Unicode Named Characters

℞ 8: Unicode named characters

Use the \N{charname} notation to get the character by that name for use in interpolated literals (double-quoted strings and regexes). From v5.16 onward, there is an implicit

 use charnames qw(:full :short);

Prior to v5.16, you must say explicitly which sets of character names you want. The :full names are the official Unicode character names, aliases, and named sequences, which all share a namespace.

 use charnames qw(:full :short latin greek);

 "\N{MATHEMATICAL ITALIC SMALL N}"      # :full
 "\N{GREEK CAPITAL LETTER SIGMA}"       # :full
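
Because \N{...} is only recognized in interpolated literals, it may help to see a :full name used in both a double-quoted string and a regex, next to a single-quoted string where it is left alone. The following is a minimal sketch; the variable names are illustrative and not part of the recipe.

```perl
use strict;
use warnings;
use charnames qw(:full);    # explicit, so this also works before v5.16

# Build a string from a named character, then match it with a
# regex that names the same character.
my $formula   = "\N{GREEK CAPITAL LETTER SIGMA} x = 10";
my $has_sigma = $formula =~ /^\N{GREEK CAPITAL LETTER SIGMA}\s/ ? 1 : 0;

# A single-quoted string does not interpolate, so the \N{...}
# text stays as literal backslash, N, braces, and name.
my $literal = '\N{GREEK CAPITAL LETTER SIGMA}';

printf "code point:     U+%04X\n", ord $formula;
printf "regex matched:  %d\n",     $has_sigma;
printf "literal length: %d\n",     length $literal;
```

Printing only code points and counts keeps the output ASCII, so the sketch does not depend on the encoding of STDOUT.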

Anything else is a Perl-specific convenience abbreviation. Specify one or more scripts by name if you want script-specific short names.

 "\N{Greek:Sigma}"                      # :short
 "\N{ae}"                               #  latin
 "\N{epsilon}"                          #  greek
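
One way to check what a short or script-specific name resolves to is to compare it with the :full name and look at the code point. This is a small sketch under the same import list as above; the variable names are illustrative, and charnames::viacode is used only to print the official name back.

```perl
use strict;
use warnings;
use charnames qw(:full :short latin greek);

# The official name and the script-qualified short name
# should resolve to the same character.
my $full  = "\N{GREEK CAPITAL LETTER SIGMA}";
my $short = "\N{Greek:Sigma}";

# With the latin script imported, a bare lowercase name is
# looked up as a small letter in that script.
my $ae = "\N{ae}";

printf "full:  U+%04X\n", ord $full;
printf "short: U+%04X\n", ord $short;
printf "same:  %s\n",     $full eq $short ? "yes" : "no";
printf "ae:    U+%04X (%s)\n", ord $ae, charnames::viacode(ord $ae);
```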

The v5.16 release also supports a :loose import for loose matching of character names, which works just like loose matching of property names: that is, it disregards case, whitespace, and underscores:

 "\N{euro sign}"                        # :loose (from v5.16)
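
As a rough illustration of loose matching, the sketch below spells the same name several ways and checks that they all yield one character. It assumes Perl v5.16 or later; the variable names are illustrative.

```perl
use v5.16;
use warnings;
use charnames qw(:loose);

# The exact official name still works under :loose.
my $exact = "\N{EURO SIGN}";

# Variants that differ only in case, whitespace, or underscores.
my @variants = (
    "\N{euro sign}",    # lower case
    "\N{EuroSign}",     # no whitespace
    "\N{Euro_Sign}",    # underscore instead of space
);

# Count how many variants resolved to the same character.
my $matches = grep { $_ eq $exact } @variants;

printf "exact:   U+%04X\n", ord $exact;
printf "matches: %d of %d\n", $matches, scalar @variants;
```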

(You do not need the charnames pragma to interpolate Unicode characters by number into literals: the \N{U+...} form, which takes a hexadecimal code point, works without it.)
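
For comparison, here is a short sketch of the by-number form, written without loading charnames. The code point U+263A and the variable names are chosen only for illustration.

```perl
use strict;
use warnings;

# No "use charnames" here: \N{U+...} takes a hexadecimal
# code point rather than a name.
my $by_n   = "\N{U+263A}";
my $by_x   = "\x{263A}";
my $by_chr = chr 0x263A;

printf "code point: U+%04X\n", ord $by_n;
print  "all three forms agree\n"
    if $by_n eq $by_x and $by_n eq $by_chr;
```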
