Perl Unicode Cookbook: Unicode Literals by Number
℞ 5: Unicode literals by character number
In an interpolated literal, whether a double-quoted string or a regex, you may specify a character by its number using the \x{HHHHHH} escape, where HHHHHH is the character's code point in hexadecimal.
String: "\x{3a3}"
Regex: /\x{3a3}/
String: "\x{1d45b}"
Regex: /\x{1d45b}/
# even non-BMP ranges in regex work fine
/[\x{1D434}-\x{1D467}]/
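As a rough illustration of the literals above, the following sketch puts them into a small script. The variable names and the sample text are made up for this example. It prints only ASCII, so it does not depend on any output encoding setup.

```perl
use strict;
use warnings;

# The same escapes as above, in double-quoted strings
my $sigma  = "\x{3a3}";      # U+03A3 GREEK CAPITAL LETTER SIGMA
my $math_n = "\x{1d45b}";    # U+1D45B MATHEMATICAL ITALIC SMALL N

# Each escape yields exactly one character, inside or outside the BMP
printf "U+%04X has length %d\n", ord($sigma),  length($sigma);
printf "U+%04X has length %d\n", ord($math_n), length($math_n);

# Hypothetical sample text containing two characters from the non-BMP range
my $text = "x = \x{1D434} + \x{1D44E}";

# The character class from above, used to collect every match
my @math_italic = $text =~ /([\x{1D434}-\x{1D467}])/g;
printf "found %d characters in the range\n", scalar @math_italic;
```

Both strings report a length of 1, since Perl counts characters here rather than bytes.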
The BMP (Basic Multilingual Plane, or Plane 0) contains the most common Unicode characters; it covers code points 0x0000 through 0xFFFF. Characters in the other planes are much more specialized and often include characters of historical interest.
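One way to see which plane a character falls in is to look at the bits above the low 16 of its code point, since each plane holds 0x10000 code points. The sketch below assumes a helper named plane_of, which is invented for this example and is not part of Perl or any module.

```perl
use strict;
use warnings;

# plane_of() is a hypothetical helper, not a built-in.
# Shifting the code point right by 16 bits drops the position
# within the plane and leaves the plane number.
sub plane_of {
    my ($char) = @_;
    return ord($char) >> 16;
}

for my $char ("\x{3a3}", "\x{1d45b}") {
    printf "U+%04X is in plane %d\n", ord($char), plane_of($char);
}
```

Here U+03A3 lands in plane 0 (the BMP) and U+1D45B in plane 1.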
Use Unicode charts to find character numbers, or see the recipe for translating characters to numbers and vice versa.
Previous: ℞ 4: Characters and Their Numbers
Series Index: The Standard Preamble