Perl’s Rich Support for Regular Expressions

Perl was originally designed by Larry Wall as a flexible text-processing language. Over the years, it has grown into a full-fledged programming language, keeping a strong focus on text processing. When the World Wide Web became popular, Perl became the de facto standard for creating CGI scripts. A CGI script is a small piece of software that generates a dynamic web page, based on a database and/or input from the person visiting the website. Since a CGI script is basically a text-processing script, Perl was and still is a natural choice.

Because of Perl’s focus on managing and mangling text, regular expression text patterns are an integral part of the Perl language. This is in contrast with most other languages, where regular expressions are available as add-on libraries. In Perl, you can use the m// operator to test whether a regex can match a string, e.g.:

if ($string =~ m/regex/) {
  print 'match';
} else {
  print 'no match';
}

Performing a regex search-and-replace is just as easy:

$string =~ s/regex/replacement/g;

I added a “g” after the last forward slash. The “g” stands for “global”, which tells Perl to replace all matches, and not just the first one. Options are typically written with a leading slash, like “/g”, even though you do not add an extra slash, and even though you could use almost any non-word, non-whitespace character as the delimiter instead of slashes. If your regex contains slashes, use another character, like s!regex!replacement!g.
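As a small sketch of why an alternative delimiter helps, the snippet below replaces every slash in a path with a backslash. The variable names and the sample path are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample data.
my $path = '/usr/local/bin';

# The pattern itself is a slash, so "!" is used as the delimiter
# instead of escaping the slash as \/.
# In the replacement, \\ stands for one literal backslash.
(my $converted = $path) =~ s!/!\\!g;

print "$converted\n";
```

With slashes as delimiters the same substitution would have to be written as s/\//\\/g, which is much harder to read.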

You can add an “i” to make the regex match case insensitive. You can add an “s” to make the dot match newlines. You can add an “m” to make the dollar and caret match at newlines embedded in the string, as well as at the start and end of the string.

Together you would get something like m/regex/sim;
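To make the effect of each option concrete, here is a minimal sketch that applies them one at a time to a three-line string. The sample text and variable names are assumptions chosen for the example.

```perl
use strict;
use warnings;

# Hypothetical three-line sample string.
my $text = "First line\nsecond LINE\nthird line";

# /i: case-insensitive, so "line" also matches "LINE".
my @words = $text =~ m/line/gi;

# /m: ^ also matches right after each embedded newline.
my @starts = $text =~ m/^(\w+)/gm;

# /s: the dot also matches newlines, so .* can span several lines.
my ($span) = $text =~ m/First(.*)third/s;

print scalar(@words), " case-insensitive matches\n";
print "Line starts: @starts\n";
print "Span contains a newline\n" if $span =~ m/\n/;
```

Without /i the first match would find only two occurrences, without /m only “First” would be captured, and without /s the last match would fail because the dot would stop at the first newline.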

Regex-Related Special Variables

Perl has a host of special variables that get filled after every m// or s/// regex match. $1, $2, $3, etc. hold the backreferences. $+ holds the last (highest-numbered) backreference. $& (dollar ampersand) holds the entire regex match.
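The following sketch shows these variables after a single match against a date. The input string and variable names are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical input.
my $text = 'Released on 2024-03-15.';
my ($whole, $year, $month, $day, $last);

if ($text =~ m/(\d{4})-(\d\d)-(\d\d)/) {
    # Copy the special variables right away, because a later
    # match would overwrite them.
    ($whole, $year, $month, $day, $last) = ($&, $1, $2, $3, $+);
    print "whole=$whole year=$year month=$month day=$day last=$last\n";
}
```

Here $+ holds the same text as $3, because the third group is the highest-numbered group that took part in the match.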

@- is an array of match-start indices into the string. $-[0] holds the start of the entire regex match, $-[1] the start of the first backreference, etc. Likewise, @+ holds match-end positions. To get the length of the match, subtract $-[0] from $+[0].
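As a sketch of how the two arrays work together, the snippet below computes the length of the overall match as end minus start, and uses the offsets of the second group to cut its text out of the string with substr(). The sample string and variable names are assumptions.

```perl
use strict;
use warnings;

# Hypothetical input.
my $string = 'set key=value now';
my ($start, $end, $length, $value);

if ($string =~ m/(\w+)=(\w+)/) {
    $start  = $-[0];            # offset where the whole match starts
    $end    = $+[0];            # offset just past the end of the match
    $length = $+[0] - $-[0];    # end minus start

    # Group 2 has its own pair of offsets in $-[2] and $+[2].
    $value = substr($string, $-[2], $+[2] - $-[2]);

    print "match at $start..$end, length $length, value '$value'\n";
}
```

The end offsets in @+ point just past the last matched character, which is why a plain subtraction gives the length.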

In Perl 5.10 and later you can use the associative array %+ to get the text matched by named capturing groups. For example, $+{name} holds the text matched by the group “name”. Perl does not provide a way to get match positions of capturing groups by referencing their names. Since named groups are also numbered, you can use @- and @+ for named groups, but you have to figure out the group’s number by yourself.
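A minimal sketch of named groups, assuming Perl 5.10 or later. The group names “name” and “id” and the input string are invented for the example. The position of the “id” group is looked up by its number, which has to be counted by hand.

```perl
use strict;
use warnings;

# Hypothetical input.
my $string = 'user: alice, id: 42';
my ($name, $id, $id_start);

if ($string =~ m/user: (?<name>\w+), id: (?<id>\d+)/) {
    # Text of the named groups comes from the %+ hash.
    $name = $+{name};
    $id   = $+{id};

    # There is no by-name lookup for positions. "id" is the second
    # group in the pattern, so its start offset is $-[2].
    $id_start = $-[2];

    print "name=$name id=$id, id starts at offset $id_start\n";
}
```

If the pattern is later edited so that a group is added before “id”, the hard-coded index 2 has to be updated as well.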

$' (dollar followed by an apostrophe or single quote) holds the part of the string after (to the right of) the regex match. $` (dollar backtick) holds the part of the string before (to the left of) the regex match. Using these variables is not recommended in scripts where performance matters, as it causes Perl to slow down all regex matches in your entire script. Perl 5.20 and later no longer impose this penalty.
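The sketch below reads both variables after a match, and then rebuilds the same two pieces with substr() and the @- and @+ offsets. The second approach is one possible way to get the text before and after the match without touching $` and $'. The sample string and variable names are assumptions.

```perl
use strict;
use warnings;

# Hypothetical input.
my $string = 'width=100';
my ($before, $after, $before_alt, $after_alt);

if ($string =~ m/=/) {
    # Direct use of the special variables.
    ($before, $after) = ($`, $');

    # The same pieces, cut out using the match offsets instead.
    $before_alt = substr($string, 0, $-[0]);
    $after_alt  = substr($string, $+[0]);

    print "before='$before' after='$after'\n";
}
```

Both approaches yield the same text here. Which one is preferable depends on the Perl versions the script has to run on.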

All these variables are read-only, and persist until the next successful regex match. They are dynamically scoped, as if they had an implicit ‘local’ at the start of the enclosing scope. Thus if you do a regex match, and call a sub that does a regex match, when that sub returns, your variables are still set as they were for the first match.
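The scoping behavior can be seen in a short sketch. The sub name helper() and the strings are made up for the example. The sub performs its own match, yet $1 in the caller is unchanged once the sub returns.

```perl
use strict;
use warnings;

# Hypothetical sub that runs its own regex match.
sub helper {
    # Inside this sub, $1 refers to the sub's own match.
    return 'inner' =~ m/(in)/ ? $1 : undef;
}

my ($outer_before, $inner, $outer_after);

if ('outer text' =~ m/(out)/) {
    $outer_before = $1;         # set by the match above
    $inner        = helper();   # helper() sets its own $1
    $outer_after  = $1;         # still the value from the outer match

    print "before=$outer_before inner=$inner after=$outer_after\n";
}
```

A second match in the same scope as the first one would replace the values, so copy anything you still need into your own variables before matching again.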

Finding All Matches In a String

The “/g” modifier can be used to process all regex matches in a string. The first m/regex/g will find the first match, the second m/regex/g the second match, etc. The location in the string where the next match attempt will begin is automatically remembered by Perl, separately for each string. Here is an example:

while ($string =~ m/regex/g) {
  # Parentheses are required: "." and "+" have the same precedence
  print "Found '$&'.  Next attempt at character " . (pos($string) + 1) . "\n";
}

The pos() function retrieves the position where the next attempt begins. The first character in the string has position zero. You can modify this position by using the function as the left side of an assignment, like in pos($string) = 123;.
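As a sketch of assigning to pos(), the snippet below skips the first item in a string by setting the start position before the loop, and records where each following attempt would begin. The sample string and variable names are assumptions.

```perl
use strict;
use warnings;

# Hypothetical input: four letter-digit pairs.
my $string = 'a1 b2 c3 d4';
my (@found, @next);

# Start matching at offset 3, which skips "a1 ".
pos($string) = 3;

while ($string =~ m/([a-z]\d)/g) {
    push @found, $1;              # text of this match
    push @next,  pos($string);    # where the next attempt begins
}

print "Found: @found\n";
print "Next positions: @next\n";

# When m//g finally fails, Perl resets the position, so pos() is
# undefined here and a new loop would start from the beginning.
print "Position reset\n" unless defined pos($string);
```

Because the position is stored per string, running the same loop on another variable at the same time does not disturb this one.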