Perl Unicode Cookbook: Unicode Named Characters

Apr 16, 2012 by Tom Christiansen

℞ 8: Unicode named characters

Use the \N{charname} notation to get the character by that name for use in interpolated literals (double-quoted strings and regexes). In v5.16, there is an implicit

 use charnames qw(:full :short);

But prior to v5.16, you must be explicit about which set of charnames you want. The :full names are the oﬃcial Unicode character name, alias, or sequence, which all share a namespace.

 use charnames qw(:full :short latin greek);

 "\N{MATHEMATICAL ITALIC SMALL N}"      # :full
 "\N{GREEK CAPITAL LETTER SIGMA}"       # :full

Anything else is a Perl-speciﬁc convenience abbreviation. Specify one or more scripts by names if you want short names that are script-speciﬁc.

 "\N{Greek:Sigma}"                      # :short
 "\N{ae}"                               #  latin
 "\N{epsilon}"                          #  greek

The v5.16 release also supports a :loose import for loose matching of character names, which works just like loose matching of property names: that is, it disregards case, whitespace, and underscores:

 "\N{euro sign}"                        # :loose (from v5.16)

(You do not have to use the charnames pragma to interpolate Unicode characters by number into literals with the \N{...} sequence.)

Previous: ℞ 7: Get Character Number by Name

Series Index: The Standard Preamble

Next: ℞ 9: Unicode Named Character Sequences

Tags

unicode

Tom Christiansen

Browse their articles

Feedback

Something wrong with this article? Help us out by opening an issue or pull request on GitHub

TPRF Gold Sponsor

TPRF Silver Sponsor

TPRF Bronze Sponsor

Perl Resources

Perl Unicode Cookbook: Unicode Named Characters

℞ 8: Unicode named characters

Tom Christiansen

Browse their articles

Feedback

Site Map

Contact Us

License

Legal