Replacement Text Case Conversion

Some applications can insert the text matched by the regex or by capturing groups converted to uppercase or lowercase.

Perl String Features in Regular Expressions and Replacement Texts

The double-slashed and triple-slashed notations for regular expressions and replacement texts in Perl support all the features of double-quoted strings. Most obvious is variable interpolation. You can insert the text matched by the regex or capturing groups simply by using the regex-related variables in your replacement text.

Perl’s case conversion escapes also work in replacement texts. The most common use is to change the case of an interpolated variable. \U converts everything up to the next \L or \E to uppercase. \L converts everything up to the next \U or \E to lowercase. \u converts the next character to uppercase. \l converts the next character to lowercase. You can combine these into \l\U to make the first character lowercase and the remainder uppercase, or \u\L to make the first character uppercase and the remainder lowercase. \E turns off case conversion. You cannot use \u or \l after \U or \L unless you first stop the sequence with \E.

When the regex (?i)(helló) (wórld) matches HeLlÓ WóRlD, the replacement text \l\U$1\E \u\L$2 becomes hELLÓ Wórld. Literal text is also affected. \U$1 Dear $2 becomes HELLÓ DEAR WÓRLD.
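Python's re module has no case conversion escapes in replacement strings, so the following is only a sketch that reproduces the two results above with replacement functions. The function names are made up for this illustration.

```python
import re

PATTERN = re.compile(r"(?i)(helló) (wórld)")

def first_example(match):
    # Same result as the first replacement text above: group 1 gets a lowercase
    # first character and an uppercase remainder, group 2 the opposite.
    first, second = match.group(1), match.group(2)
    return (first[:1].lower() + first[1:].upper()
            + " "
            + second[:1].upper() + second[1:].lower())

def second_example(match):
    # Same result as \U$1 Dear $2: the literal " Dear " is uppercased as well.
    return (match.group(1) + " Dear " + match.group(2)).upper()

subject = "HeLlÓ WóRlD"
print(PATTERN.sub(first_example, subject))   # hELLÓ Wórld
print(PATTERN.sub(second_example, subject))  # HELLÓ DEAR WÓRLD
```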

Perl’s case conversion works in regular expressions too. But it doesn’t work the way you might expect. Perl applies case conversion when it parses a string in your script and interpolates variables. That works great with backreferences in replacement texts, because those are really interpolated variables in Perl. But backreferences in the regular expression are regular expression tokens rather than variables. (?-i)(a)\U\1 matches aa but not aA. \1 is converted to uppercase while the regex is parsed, not during the matching process. Since \1 does not include any letters, this has no effect. In the regex \U\w, \w is converted to uppercase while the regex is parsed. This means that \U\w is the same as \W, which matches any character that is not a word character.
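The parse-time effect can be modeled in Python by uppercasing the regex source text before compiling it. This is only an analogy for the behavior described above, not how Perl is implemented, and parse_time_upper is a hypothetical helper.

```python
import re

def parse_time_upper(pattern_source):
    # Model: the conversion is applied to the regex source text itself,
    # before the regex engine ever sees it.
    return pattern_source.upper()

print(parse_time_upper(r"\1"))  # \1  (no letters, so nothing changes)
print(parse_time_upper(r"\w"))  # \W  (now the negated shorthand class)

# (a) followed by the "uppercased" backreference still only matches aa.
backref = re.compile(r"(a)" + parse_time_upper(r"\1"))
print(bool(backref.fullmatch("aa")))  # True
print(bool(backref.fullmatch("aA")))  # False

# The "uppercased" \w matches non-word characters only.
negated = re.compile(parse_time_upper(r"\w"))
print(bool(negated.fullmatch("x")))   # False
print(bool(negated.fullmatch("-")))   # True
```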

Boost’s Replacement String Case Conversion

Boost supports case conversion in replacement strings when using the default replacement format or the “all” replacement format. \U converts everything up to the next \L or \E to uppercase. \L converts everything up to the next \U or \E to lowercase. \u converts the next character to uppercase. \l converts the next character to lowercase. \E turns off case conversion. As in Perl, the case conversion affects both literal text in your replacement string and the text inserted by backreferences.

Where Boost differs from Perl is that combining these needs to be done the other way around. \U\l makes the first character lowercase and the remainder uppercase. \L\u makes the first character uppercase and the remainder lowercase. Boost also allows \l inside a \U sequence and \u inside a \L sequence. So when (?i)(helló) (wórld) matches HeLlÓ WóRlD, you can use \L\u\1 \u\2 to replace the match with Helló Wórld.
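The rules described for Boost can be pictured as a persistent mode (\U, \L, cleared by \E) plus a one-shot override (\u, \l) for the next character. The small Python interpreter below is a sketch of that model only. It is not Boost's implementation, the name expand_boost_like is hypothetical, and it handles just the escapes discussed here plus single-digit \N backreferences.

```python
import re

# Tokens: a case escape, a single-digit backreference, or any other character.
TOKEN = re.compile(r"\\([ULEul])|\\(\d)|(.)", re.DOTALL)

def expand_boost_like(template, groups):
    """groups[0] is the whole match, groups[1:] are the capturing groups."""
    out = []
    mode = None      # persistent mode set by \U or \L, cleared by \E
    one_shot = None  # pending \u or \l, applies to the next character only

    def emit(text):
        nonlocal one_shot
        for ch in text:
            if one_shot is not None:
                ch = ch.upper() if one_shot == "u" else ch.lower()
                one_shot = None
            elif mode == "U":
                ch = ch.upper()
            elif mode == "L":
                ch = ch.lower()
            out.append(ch)

    for esc, digit, literal in TOKEN.findall(template):
        if esc in ("U", "L"):
            mode = esc
        elif esc == "E":
            mode = None
        elif esc in ("u", "l"):
            one_shot = esc
        elif digit:
            emit(groups[int(digit)])
        else:
            emit(literal)
    return "".join(out)

groups = ["HeLlÓ WóRlD", "HeLlÓ", "WóRlD"]
print(expand_boost_like(r"\L\u\1 \u\2", groups))  # Helló Wórld
print(expand_boost_like(r"\U\l\1", groups))       # hELLÓ
```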

PCRE2’s Replacement String Case Conversion

PCRE2 supports case conversion in replacement strings when using PCRE2_SUBSTITUTE_EXTENDED. \U converts everything that follows to uppercase. \L converts everything that follows to lowercase. \u converts the next character to uppercase. \l converts the next character to lowercase. \E turns off case conversion. As in Perl, the case conversion affects both literal text in your replacement string and the text inserted by backreferences.

Unlike in Perl, in PCRE2 \U, \L, \u, and \l all stop any preceding case conversion. So you cannot combine \L and \u, for example, to make the first character uppercase and the remainder lowercase. \L\u makes the first character uppercase and leaves the rest unchanged, just like \u. \u\L makes all characters lowercase, just like \L.
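The "each escape stops the previous one" rule can be modeled with a single state variable that every escape overwrites. The Python sketch below follows only the description in the paragraph above. It is not PCRE2 code, expand_reset_model is a hypothetical name, and it understands only the five case escapes and single-digit $N references.

```python
import re

# Tokens: a case escape, a single-digit $N reference, or any other character.
TOKEN = re.compile(r"\\([ULEul])|\$(\d)|(.)", re.DOTALL)

def expand_reset_model(template, groups):
    """groups[0] is the whole match, groups[1:] are the capturing groups."""
    out = []
    state = None  # at most one conversion is active: "U", "L", "u", "l", or None

    for esc, digit, literal in TOKEN.findall(template):
        if esc:
            # Every escape replaces whatever was in effect; \E clears it.
            state = None if esc == "E" else esc
            continue
        text = groups[int(digit)] if digit else literal
        for ch in text:
            if state in ("U", "u"):
                ch = ch.upper()
            elif state in ("L", "l"):
                ch = ch.lower()
            if state in ("u", "l"):
                state = None  # one-shot escapes expire after one character
            out.append(ch)
    return "".join(out)

print(expand_reset_model(r"\L\u$1", ["", "heLLÓ"]))  # HeLLÓ (same as \u alone)
print(expand_reset_model(r"\u\L$1", ["", "HeLLÓ"]))  # helló (same as \L alone)
```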

In PCRE2, case conversion runs through conditionals. Any case conversion in effect before the conditional also applies to the conditional. If the conditional contains its own case conversion escapes in the part of the conditional that is actually used, then those remain in effect after the conditional. So you could use ${1:+\U:\L}${2} to insert the text matched by the second capturing group in uppercase if the first group participated, and in lowercase if it didn’t.
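The effect of ${1:+\U:\L}${2} can be approximated in Python with a replacement function that checks whether the first group participated in the match. The regex (#)?(\w+) and the function name are assumptions chosen for this sketch.

```python
import re

# Hypothetical regex: an optional "#" marker (group 1) followed by a word (group 2).
PATTERN = re.compile(r"(#)?(\w+)")

def conditional_case(match):
    word = match.group(2)
    # Group 1 participated: behave like \U; otherwise behave like \L.
    return word.upper() if match.group(1) is not None else word.lower()

print(PATTERN.sub(conditional_case, "#Abc"))  # ABC
print(PATTERN.sub(conditional_case, "Abc"))   # abc
```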

R’s Backreference Case Conversion

The sub() and gsub() functions in R support case conversion escapes that are inspired by Perl strings when called with perl = TRUE. \U converts all backreferences up to the next \L or \E to uppercase. \L converts all backreferences up to the next \U or \E to lowercase. \E turns off case conversion.

When the regex (?i)(Helló) (Wórld) matches HeLlÓ WóRlD, the replacement string \U\1 \L\2 becomes HELLÓ wórld. Literal text is not affected. \U\1 Dear \2 becomes HELLÓ Dear WÓRLD.
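The difference from Perl, where literal text is converted too, can be sketched in Python by converting only the group texts and leaving the literal parts alone. This mirrors the two results above and is not R code. The function names are hypothetical.

```python
import re

PATTERN = re.compile(r"(?i)(Helló) (Wórld)")

def upper_then_lower(match):
    # Group 1 in uppercase, group 2 in lowercase, literal space untouched.
    return match.group(1).upper() + " " + match.group(2).lower()

def upper_backrefs_only(match):
    # Uppercase stays in effect for both groups, but " Dear " is literal text
    # and is therefore left as written.
    return match.group(1).upper() + " Dear " + match.group(2).upper()

subject = "HeLlÓ WóRlD"
print(PATTERN.sub(upper_then_lower, subject))     # HELLÓ wórld
print(PATTERN.sub(upper_backrefs_only, subject))  # HELLÓ Dear WÓRLD
```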