Feature | Syntax | Description | Example | .NET | Java | Perl | PCRE | PCRE2 | PHP | Delphi | R | JavaScript | VBScript | XRegExp | Python | Ruby | std::regex | Boost | Tcl ARE | POSIX BRE | POSIX ERE | GNU BRE | GNU ERE | Oracle | XML | XPath |
Character class |
[ |
When used outside a character class, [ begins a character class. Inside a character class, different rules apply. Unless otherwise noted, the syntax on this page is only valid inside character classes, while the syntax on all other reference pages is not valid inside character classes. |
|
YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES |
Literal character |
Any character except ^-]\ |
All characters except the listed special characters are literal characters that add themselves to the character class. |
[abc] matches a, b or c |
YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES |
Backslash escapes a metacharacter |
\ (backslash) followed by any of ^-]\ |
A backslash escapes special characters to suppress their special meaning. |
[\^\]] matches ^ or ] |
YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | ECMA | ECMA awk | YES | no | no | no | no | no | YES | YES |
Literal backslash |
\ |
A backslash is a literal character that adds a backslash to the character class. |
[\] matches \ |
no | no | no | no | no | no | no | no | no | no | no | no | no | basic extended grep egrep awk | basic extended grep egrep | no | YES | YES | YES | YES | YES | no | no |
Range |
- (hyphen) between two tokens that each specify a single character. |
Adds a range of characters to the character class. |
[a-zA-Z0-9] matches any ASCII letter or digit |
YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES |
Negated character class |
^ (caret) immediately after the opening [ |
Negates the character class, causing it to match a single character not listed in the character class. |
[^a-d] matches x (any character except a, b, c or d) |
YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES |
Literal opening bracket |
[ |
An opening square bracket is a literal character that adds an opening square bracket to the character class. |
[ab[cd]ef] matches aef], bef], [ef], cef], and def] |
YES | no | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | no | YES | YES | YES | YES | YES | YES | YES | YES | no | no |
Nested character class |
[ |
An opening square bracket inside a character class begins a nested character class. |
[ab[cd]ef] is the same as [abcdef] and matches any letter between a and f. |
no | YES | no | no | no | no | no | no | no | no | no | no | 1.9 | no | no | no | no | no | no | no | no | no | no |
Character class subtraction |
[base-[subtract]] |
Removes all characters in the “subtract” class from the “base” class. |
[a-z-[aeiuo]] matches a single letter that is not a vowel. |
2.0–7.0 | no | no | no | no | no | no | no | no | no | no | no | no | no | no | no | no | no | no | no | no | YES | YES |
Character class intersection |
[base&&[intersect]] |
Reduces the character class to the characters present in both “base” and “intersect”. |
[a-z&&[^aeiuo]] matches a single letter that is not a vowel. |
no | YES | no | no | no | no | no | no | no | no | no | no | 1.9 | no | no | no | no | no | no | no | no | no | no |
Character class intersection |
[base&&intersect] |
Reduces the character class to the characters present in both “base” and “intersect”. |
[\p{Nd}&&\p{InThai}] matches a single Thai digit. |
no | YES | no | no | no | no | no | no | no | no | no | no | 1.9 | no | no | no | no | no | no | no | no | no | no |
Character escape |
\n, \r and \t |
Add an LF character, a CR character, or a tab character to the character class, respectively. |
[\n\r\t] matches a line feed, a carriage return, or a tab. |
YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | ECMA awk | ECMA awk | YES | string | string | string | string | no | YES | YES |
Character escape |
\a |
Add the “alert” or “bell” control character (ASCII 0x07) to the character class. |
[\a\t] matches a bell or a tab character. |
YES | YES | YES | YES | YES | YES | YES | YES | no | no | no | YES | YES | awk | ECMA awk | YES | no | no | no | no | no | no | no |
Character escape |
\b |
Add the “backspace” control character (ASCII 0x08) to the character class. |
[\b\t] matches a backspace or a tab character. |
YES | no | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | ECMA VC’12–VC’15 awk VC’08–VC’22 | ECMA awk | YES | no | no | no | no | no | YES | YES |
Character escape |
\B |
Add a backslash to the character class. |
[\B] matches \ |
no | no | no | no | no | no | no | no | no | no | no | no | no | no | no | YES | no | no | no | no | no | no | no |
Character escape |
\e |
Add the “escape” control character (ASCII 0x1B) to the character class. |
[\e\t] matches an escape or a tab character. |
YES | YES | YES | YES | YES | YES | YES | YES | no | no | no | no | YES | no | ECMA awk | YES | no | no | no | no | no | no | no |
Character escape |
\f |
Add the “form feed” control character (ASCII 0x0C) to the character class. |
[\f\t] matches a form feed or a tab character. |
YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | ECMA awk | ECMA awk | YES | no | no | no | no | no | no | no |
Character escape |
\v |
Add the “vertical tab” control character (ASCII 0x0B) to the character class, without adding any other vertical whitespace. |
[\v\t] matches a vertical tab or a tab character. |
YES | 4–7 | no | no | no | no | no | no | YES | YES | YES | YES | YES | ECMA awk | ECMA awk | YES | no | no | no | no | no | no | no |
POSIX class |
[:alpha:] |
Matches one character from a POSIX character class. Can only be used in a bracket expression. |
[[:digit:][:lower:]] matches one of 0 through 9 or a through z |
no | no | Unicode | ASCII | ASCII | 5.3.4 Unicode 5.0.0 code page | ASCII | ASCII | no | no | no | no | 1.9 Unicode 1.8 ASCII | Unicode | Unicode | Unicode | ASCII | ASCII | ASCII | ASCII | Unicode | no | no |
POSIX class |
[:^alpha:] |
Matches one character that is not part of a specific POSIX character class. Can only be used in a bracket expression. |
[5[:^digit:]] matches the digit 5 or any other character that is not a digit. |
no | no | YES | YES | YES | YES | YES | YES | no | no | no | 3.7–3.10 error | 1.9 | error | YES | error | error | error | error | error | error | no | no |
POSIX shorthand class |
[:d:], [:s:], [:w:] |
Matches one character from the POSIX character classes “digit”, “space”, or “word”. Can only be used in a bracket expression. |
[[:s:][:d:]] matches a space, a tab, a line break, or one of 0 through 9 |
no | no | no | no | no | no | no | no | no | no | no | no | no | Unicode | Unicode | no | no | no | no | no | no | no | no |
POSIX shorthand class |
[:l:] and [:u:] |
Matches one character from the POSIX character classes “lower” or “upper”. Can only be used in a bracket expression. |
[[:u:]][[:l:]] matches Aa but not aA. |
no | no | no | no | no | no | no | no | no | no | no | no | no | no | Unicode | no | no | no | no | no | no | no | no |
POSIX shorthand class |
[:h:] |
Matches one character from the POSIX character class “blank”. Can only be used in a bracket expression. |
[[:h:]] matches a space. |
no | no | no | no | no | no | no | no | no | no | no | no | no | no | 1.42–1.83 Unicode | no | no | no | no | no | no | no | no |
POSIX shorthand class |
[:v:] |
Matches a vertical whitespace character. Can only be used in a bracket expression. |
[[:v:]] matches any single vertical whitespace character. |
no | no | no | no | no | no | no | no | no | no | no | no | no | no | 1.42–1.83 Unicode | no | no | no | no | no | no | no | no |
POSIX class |
Any supported \p{…} syntax |
\p{…} syntax can be used inside character classes. |
[\p{Digit}\p{Lower}] matches one of 0 through 9 or a through z |
n/a | 9 | YES | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | 1.9 | n/a | extended egrep | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a |
POSIX class |
\p{Alpha} |
Matches one character from a POSIX character class. |
\p{Digit} matches any single digit. |
no | ASCII | Unicode | no | no | no | no | no | no | no | no | no | 1.9 Unicode | no | ECMA extended egrep awk Unicode | no | no | no | no | no | no | no | no |
POSIX class |
\p{IsAlpha} |
Matches one character from a POSIX character class. |
\p{IsDigit} matches any single digit. |
no | 9 Unicode 4 ASCII | Unicode | no | no | no | no | no | no | no | no | no | no | no | no | no | no | no | no | no | no | no | no |
POSIX collation sequence |
[.span-ll.] |
Matches a POSIX collation sequence. Can only be used in a bracket expression. |
[[.span-ll.]] matches ll in the Spanish locale |
no | no | error | error | error | error | error | error | no | no | no | 3.7–3.10 error | 1.8 only error | fail | YES | YES | YES | YES | YES | YES | YES | no | no |
POSIX character equivalence |
[=x=] |
Matches a POSIX character equivalence. Can only be used in a bracket expression. |
[[=e=]] matches e, é, è and ê in the French locale |
no | no | error | error | error | error | error | error | no | no | no | 3.7–3.10 error | 1.8 only error | YES | YES | YES | YES | YES | YES | YES | YES | no | no |
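
As a rough illustration of the basic rows above (literal characters, backslash escapes, ranges, negation, and the literal opening bracket), the following sketch uses Python's built-in `re` module, one of the flavors marked YES for those rows. The sample strings and variable names are made up for this example; other flavors in the table may behave differently.

```python
import re
import warnings

# Literal characters: [abc] matches a single a, b or c.
literal_hits = re.findall(r"[abc]", "cabbage")

# Backslash escapes: [\^\]] matches a literal ^ or ].
escaped_hits = re.findall(r"[\^\]]", "x^y]z")

# Range: [a-zA-Z0-9] matches any ASCII letter or digit.
range_hits = re.findall(r"[a-zA-Z0-9]", "R2-D2!")

# Negation: [^a-d] matches any single character other than a, b, c or d.
negated_hits = re.findall(r"[^a-d]", "abxd")

# Literal opening bracket: in Python, [ab[cd] is one class containing
# a, b, [, c and d, followed by the literal text ef].
# Recent Python versions emit a FutureWarning about a possible nested set,
# so the warning is silenced here.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", FutureWarning)
    bracket_match = re.fullmatch(r"[ab[cd]ef]", "[ef]")

print(literal_hits, escaped_hits, range_hits, negated_hits)
print(bracket_match is not None)
```

In a flavor that treats the inner `[` as the start of a nested class (Java, per the table), the last pattern would instead behave like `[abcdef]`, so the same pattern string is not portable across flavors.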
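
The control-character escape rows can be checked in a similar way. This is a minimal sketch, again assuming Python's `re` module (Python 3.6 or later); the sample string is invented for the example. It shows that `\b` inside a character class is a backspace rather than a word boundary, and that `\e`, which the table lists as unsupported for Python, is rejected when the pattern is compiled.

```python
import re

# Inside a class, \a, \b, \f, \v and \t add the bell, backspace,
# form feed, vertical tab and tab control characters.
control_class = re.compile(r"[\a\b\f\v\t]")
sample = "bell\x07 backspace\x08 formfeed\x0c vtab\x0b tab\x09"
found = [hex(ord(ch)) for ch in control_class.findall(sample)]
print(found)

# \e is not a recognized escape in Python's re module, so compiling
# a class that uses it raises re.error instead of matching ESC (0x1B).
try:
    re.compile(r"[\e\t]")
    escape_supported = True
except re.error:
    escape_supported = False
print(escape_supported)
```

Where `\e` is unavailable, a hexadecimal escape such as `\x1b` is one way to add the escape character to a class in Python.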