Posted by admin on March 5, 2024

Category

Regular Expressions

Using Regular Expressions with Microsoft .NET

Microsoft .NET, which you can use with any .NET programming language such as C# (C sharp) or Visual Basic.NET, has solid support for regular expressions. .NET’s regex flavor is very feature-rich. The only noteworthy features that are lacking are possessive quantifiers and subroutine calls.

There are no differences in the regex flavor supported by the .NET Framework versions 2.0 through 4.8. There are no differences between this flavor and the flavor supported by any version of .NET Core either. That includes the original .NET Core 1.0.0 and the latest .NET 5.0.

The regex flavor in the .NET Framework 1.x differs from later versions in a few ways. The .NET Framework 2.0 fixes a few bugs. The Unicode categories \p{Pi} and \p{Pf} are no longer reversed. Unicode blocks with hyphens in their names are now handled correctly. One feature was added in .NET 2.0: character class subtraction. It works exactly the way it does in XML Schema regular expressions. The XML Schema standard first defined this feature and its syntax.
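As a quick illustration of character class subtraction, here is a minimal C# sketch. The class name `SubtractionDemo` and the method `IsConsonant` are made up for this example; only the `[a-z-[aeiou]]` syntax comes from the feature described above.

```csharp
using System.Text.RegularExpressions;

public static class SubtractionDemo
{
    // [a-z-[aeiou]] means: any lowercase ASCII letter, minus the vowels.
    private static readonly Regex Consonant = new Regex(@"^[a-z-[aeiou]]$");

    // Hypothetical helper: true when s is exactly one lowercase consonant.
    public static bool IsConsonant(string s) => Consonant.IsMatch(s);
}
```

With this sketch, `SubtractionDemo.IsConsonant("b")` is true and `SubtractionDemo.IsConsonant("a")` is false.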

System.Text.RegularExpressions Overview (Using VB.NET Syntax)

The regex classes are located in the namespace System.Text.RegularExpressions. To make them available, place Imports System.Text.RegularExpressions at the start of your source code.

The Regex class is the one you use to compile a regular expression. For efficiency, regular expressions are compiled into an internal format. If you plan to use the same regular expression repeatedly, construct a Regex object as follows: Dim RegexObj As Regex = New Regex("regularexpression"). You can then call RegexObj.IsMatch("subject") to check whether the regular expression matches the subject string. The Regex constructor accepts an optional second parameter of type RegexOptions. You could specify RegexOptions.IgnoreCase as the final parameter to make the regex case insensitive. Other options are RegexOptions.IgnorePatternWhitespace, which makes the regex free-spacing; RegexOptions.Singleline, which makes the dot match newlines; RegexOptions.Multiline, which makes the caret and dollar match at embedded newlines in the subject string; and RegexOptions.ExplicitCapture, which turns all unnamed groups into non-capturing groups.
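The construct-once, reuse-many pattern might look like the following in C# (the prose here uses VB.NET syntax, but the classes are the same). `OptionsDemo`, `HasErrorLine`, and the `^error` pattern are assumptions chosen only for illustration.

```csharp
using System.Text.RegularExpressions;

public static class OptionsDemo
{
    // Built once and reused. Options are combined with the bitwise OR operator.
    private static readonly Regex LineStartsWithError =
        new Regex("^error", RegexOptions.IgnoreCase | RegexOptions.Multiline);

    // Hypothetical helper: true when any line of the log starts with "error",
    // in any letter case.
    public static bool HasErrorLine(string log) => LineStartsWithError.IsMatch(log);
}
```

Here `OptionsDemo.HasErrorLine("ok\nERROR: disk full")` is true because Multiline lets the caret match after the embedded newline, while `OptionsDemo.HasErrorLine("no error here")` is false.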

Call RegexObj.Replace("subject", "replacement") to perform a search-and-replace using the regex on the subject string, replacing all matches with the replacement string. In the replacement string, you can use $& to insert the entire regex match into the replacement text. You can use $1, $2, $3, etc. to insert the text matched between capturing parentheses into the replacement text. Use $$ to insert a single dollar sign into the replacement text. To replace with the first backreference immediately followed by the digit 9, use ${1}9. If you type $19, and there are fewer than 19 backreferences, then $19 will be interpreted as literal text, and appear in the result string as such. To insert the text from a named capturing group, use ${name}. Improper use of the $ sign may produce an undesirable result string, but will never cause an exception to be raised.
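The replacement tokens described above can be tried out with a small C# sketch. The class `ReplaceDemo`, the method `Apply`, the group name `mid`, and the subject string "abc" are all assumptions made for the example.

```csharp
using System.Text.RegularExpressions;

public static class ReplaceDemo
{
    // One named group, which also receives the number 1
    // because the pattern has no unnamed groups.
    private static readonly Regex Middle = new Regex("(?<mid>b)");

    // Hypothetical helper: replaces every match in the fixed subject "abc".
    public static string Apply(string replacement) => Middle.Replace("abc", replacement);
}
```

With this sketch, `Apply("[$&]")` gives `a[b]c`, `Apply("${1}9")` gives `ab9c`, `Apply("$19")` gives `a$19c` (there is no group 19, so the text is literal), `Apply("$$")` gives `a$c`, and `Apply("<${mid}>")` gives `a<b>c`.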

RegexObj.Split("Subject") splits the subject string along regex matches, returning an array of strings. The array contains the text between the regex matches. If the regex contains capturing parentheses, the text matched by them is also included in the array. If you want the entire regex matches to be included in the array, simply place parentheses around the entire regular expression when instantiating RegexObj.
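The effect of capturing parentheses on Split can be seen in a short C# sketch. `SplitDemo` and its two methods are hypothetical names, and the delimiter pattern `[,;]` is just an example.

```csharp
using System.Text.RegularExpressions;

public static class SplitDemo
{
    // No capturing group: the delimiters are dropped from the result.
    public static string[] WithoutDelimiters(string s) => new Regex("[,;]").Split(s);

    // Whole regex wrapped in parentheses: the delimiters are kept in the array.
    public static string[] WithDelimiters(string s) => new Regex("([,;])").Split(s);
}
```

For the subject "a,b;c", `WithoutDelimiters` returns the three items a, b, c, while `WithDelimiters` returns the five items a, `,`, b, `;`, c.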

The Regex class also contains several static methods that allow you to use regular expressions without instantiating a Regex object. This reduces the amount of code you have to write, and is appropriate if the same regular expression is used only once or is seldom reused. Note that member overloading is used a lot in the Regex class. All the static methods have the same names as their non-static counterparts, but different parameter lists.

Regex.IsMatch("subject", "regex") checks if the regular expression matches the subject string. Regex.Replace("subject", "regex", "replacement") performs a search-and-replace. Regex.Split("subject", "regex") splits the subject string into an array of strings as described above. All these methods accept an optional additional parameter of type RegexOptions, like the constructor.
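For one-off use, the static overloads might be called as in this C# sketch. `StaticDemo`, `ContainsCat`, and `MaskDigits` are hypothetical helpers written for illustration.

```csharp
using System.Text.RegularExpressions;

public static class StaticDemo
{
    // Static IsMatch with the optional RegexOptions argument.
    public static bool ContainsCat(string s) =>
        Regex.IsMatch(s, "cat", RegexOptions.IgnoreCase);

    // Static Replace: every ASCII digit becomes '#'.
    public static string MaskDigits(string s) => Regex.Replace(s, "[0-9]", "#");
}
```

`StaticDemo.ContainsCat("Concatenate")` is true, and `StaticDemo.MaskDigits("a1b22")` returns `a#b##`.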

The System.Text.RegularExpressions.Match Class

If you want more information about the regex match, call Regex.Match() to construct a Match object. If you instantiated a Regex object, use Dim MatchObj as Match = RegexObj.Match("subject"). If not, use the static version: Dim MatchObj as Match = Regex.Match("subject", "regex").

Either way, you will get an object of class Match that holds the details about the first regex match in the subject string. MatchObj.Success indicates if there actually was a match. If so, use MatchObj.Value to get the contents of the match, MatchObj.Length for the length of the match, and MatchObj.Index for the start of the match in the subject string. The start of the match is zero-based, so it effectively counts the number of characters in the subject string to the left of the match.

If the regular expression contains capturing parentheses, use the MatchObj.Groups collection. MatchObj.Groups.Count indicates the number of capturing parentheses. The count includes the zeroth group, which is the entire regex match. MatchObj.Groups(3).Value gets the text matched by the third pair of parentheses. MatchObj.Groups(3).Length and MatchObj.Groups(3).Index get the length of the text matched by the group and its index in the subject string, relative to the start of the subject string. MatchObj.Groups("name") gets the details of the named group “name”.
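A minimal C# sketch of reading a Match and its groups follows. C# indexes the collection with square brackets, so the VB.NET expression MatchObj.Groups(3) becomes `match.Groups[3]`. The names `MatchDemo`, `First`, and the group name `value` are assumptions. Note that .NET numbers unnamed groups before named ones, which is why the unnamed group comes first in this pattern.

```csharp
using System.Text.RegularExpressions;

public static class MatchDemo
{
    // Group 1 is the unnamed (\w+); the named group "value" becomes group 2.
    private static readonly Regex KeyValue = new Regex(@"(\w+)=(?<value>\d+)");

    // Hypothetical helper: returns the first match in the subject string.
    public static Match First(string s) => KeyValue.Match(s);
}
```

For `Match m = MatchDemo.First("x: width=42")`, `m.Success` is true, `m.Value` is `width=42`, `m.Index` is 3, `m.Length` is 8, `m.Groups.Count` is 3 (including group 0), `m.Groups[1].Value` is `width`, and `m.Groups["value"].Value` is `42` at index 9.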

To find the next match of the regular expression in the same subject string, call MatchObj.NextMatch(), which returns a new Match object containing the results for the second match attempt. Because NextMatch() returns a new object rather than updating the existing one, assign its result back (MatchObj = MatchObj.NextMatch()) and keep doing so until MatchObj.Success is False.

Note that after calling RegexObj.Match(), the resulting Match object is independent from RegexObj. This means you can work with several Match objects created by the same Regex object simultaneously.
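The NextMatch() loop could be sketched in C# as follows. `NextMatchDemo` and `AllNumbers` are hypothetical names, and the `[0-9]+` pattern is only an example.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class NextMatchDemo
{
    // Hypothetical helper: collects every run of ASCII digits in s.
    public static List<string> AllNumbers(string s)
    {
        var found = new List<string>();
        Match m = Regex.Match(s, "[0-9]+");
        while (m.Success)
        {
            found.Add(m.Value);
            // NextMatch() returns a new Match, so the variable is reassigned.
            m = m.NextMatch();
        }
        return found;
    }
}
```

For the subject "a1b22c333" this returns the three strings 1, 22, and 333.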

Regular Expressions, Literal Strings and Backslashes

In literal C# strings, as well as in C++ and many other .NET languages, the backslash is an escape character. The literal string "\\" is a single backslash. In regular expressions, the backslash is also an escape character. The regular expression \\ matches a single backslash. Written as a C# string, this regular expression becomes "\\\\". That’s right: 4 backslashes to match a single one.

The regex \w matches a word character. As a C# string, this is written as "\\w".

To make your code more readable, you should use C# verbatim strings. In a verbatim string, a backslash is an ordinary character. This allows you to write the regular expression in your C# code as you would write it in a tool, or as the user would type it into your application. The regex to match a backslash is written as @"\\" when using C# verbatim strings. The backslash is still an escape character in the regular expression, so you still need to double it. But doubling is better than quadrupling. To match a word character, use the verbatim string @"\w".
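The equivalence of the two spellings can be checked with a small C# sketch. `BackslashDemo` and its members are names invented for this example.

```csharp
using System.Text.RegularExpressions;

public static class BackslashDemo
{
    // Regular string literal: four backslashes in source, two in the string.
    public const string Escaped = "\\\\";

    // Verbatim string literal: two backslashes in source, two in the string.
    public const string Verbatim = @"\\";

    // Both literals produce the same two-character regex.
    public static bool SamePattern => Escaped == Verbatim;

    // Hypothetical helper: true when s contains at least one backslash.
    public static bool ContainsBackslash(string s) => Regex.IsMatch(s, Verbatim);
}
```

`BackslashDemo.SamePattern` is true, `BackslashDemo.Escaped.Length` is 2, and `BackslashDemo.ContainsBackslash(@"C:\temp")` is true.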

RegexOptions.ECMAScript

Passing RegexOptions.ECMAScript to the Regex() constructor changes the behavior of certain regex features to follow the behavior prescribed in the ECMA-262 standard. This standard defines the ECMAScript language, which is better known as JavaScript. The table below compares canonical .NET (without the ECMAScript option) with .NET in ECMAScript mode. For reference, the table also shows how JavaScript in modern browsers behaves in these areas.

Feature or Syntax	Canonical .NET	.NET in ECMAScript mode	JavaScript
RegexOptions.FreeSpacing	Supported	Only via `(?x)`	Not supported
RegexOptions.SingleLine	Supported	Only via `(?s)`	Not supported
RegexOptions.ExplicitCapture	Supported	Only via `(?n)`	Not supported
Escaped letter or underscore that does not form a regex token	Error	Literal letter or underscore	Literal letter or underscore
Escaped digit that is not a valid backreference	Error	Octal escape or literal 8 or 9	Octal escape or literal 8 or 9
Escaped double digits that do not form a valid backreference	Error	Single digit backreference and literal digit if the single digit backreference is valid; otherwise single or double digit octal escape and/or literal 8 and 9	Single digit backreference and literal digit if the single digit backreference is valid; otherwise single or double digit octal escape and/or literal 8 and 9
Backreference to non-participating group	Fails to match	Zero-length match	Zero-length match
Forward reference	Supported	Error	Zero-length match
Backreference to group 0	Fails to match	Zero-length match	Syntactically not possible
`\s`	Unicode	ASCII	Unicode
`\d`	Unicode	ASCII	ASCII
`\w`	Unicode	ASCII	ASCII
`\b`	Unicode	ASCII	ASCII
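The `\d` row of the table can be observed with a short C# sketch. `EcmaDemo` and its members are hypothetical. The test character is U+0663 (ARABIC-INDIC DIGIT THREE), a Unicode decimal digit outside ASCII.

```csharp
using System.Text.RegularExpressions;

public static class EcmaDemo
{
    // A decimal digit that is not in the ASCII range 0-9.
    public const string ArabicIndicThree = "\u0663";

    // Canonical .NET: \d matches any Unicode decimal digit.
    public static bool IsDigitDefault(string s) => Regex.IsMatch(s, @"^\d$");

    // ECMAScript mode: \d matches ASCII digits only.
    public static bool IsDigitEcma(string s) =>
        Regex.IsMatch(s, @"^\d$", RegexOptions.ECMAScript);
}
```

`IsDigitDefault(ArabicIndicThree)` is true and `IsDigitEcma(ArabicIndicThree)` is false, while both methods return true for "3".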

Though RegexOptions.ECMAScript brings the .NET regex engine a little bit closer to JavaScript’s, there are still significant differences between the .NET regex flavor and the JavaScript regex flavor. When creating web pages using ASP.NET on the server and JavaScript on the client, you cannot assume that the same regex will work the same way on both the client side and the server side, even when setting RegexOptions.ECMAScript. The next table lists the more important differences between .NET and JavaScript. RegexOptions.ECMAScript has no impact on any of these.

The table also compares the XRegExp library for JavaScript. You can use this library to bring JavaScript’s regex flavor a little bit closer to .NET’s.

Feature or syntax	.NET	XRegExp	JavaScript
Dot	`[^\n]`	`[^\n\r\u2028\u2029]`	`[^\n\r\u2028\u2029]`
Anchors in multi-line mode	Treat only `\n` as a line break	Treat `\n`, `\r`, `\u2028`, and `\u2029` as line breaks	Treat `\n`, `\r`, `\u2028`, and `\u2029` as line breaks
`$` without multi-line mode	Matches before final line break and at very end of string	Matches at very end of string	Matches at very end of string
Permanent start and end of string anchors	Supported	Not supported	Not supported
Empty character class	Syntactically not possible	Fails to match	Fails to match
Lookbehind	Supported without restrictions	Supported (without restrictions) since ECMAScript 2018	Supported (without restrictions) since ECMAScript 2018
Mode modifiers	Anywhere	At start of regex only	Not supported
Comments	Supported		Not supported
Unicode properties	Categories and blocks		Not supported
Named capture and backreferences	Supported		Not supported
Balancing groups	Supported	Not supported	Not supported
Conditionals	Supported	Not supported	Not supported
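The .NET side of the dot and multi-line anchor rows can be checked with the following C# sketch. `LineBreakDemo` and its methods are hypothetical names; the JavaScript behavior is not exercised here.

```csharp
using System.Text.RegularExpressions;

public static class LineBreakDemo
{
    // In .NET the dot excludes only \n, so it matches a lone carriage return.
    public static bool DotMatchesCarriageReturn() => Regex.IsMatch("\r", "^.$");

    // In Multiline mode the caret matches after \n, but not after a lone \r.
    public static bool CaretMatchesAfter(string lineBreak) =>
        Regex.IsMatch("a" + lineBreak + "b", "^b", RegexOptions.Multiline);
}
```

`DotMatchesCarriageReturn()` is true, `CaretMatchesAfter("\n")` is true, and `CaretMatchesAfter("\r")` is false. Text with bare `\r` line breaks therefore needs extra care when the same pattern has to run in both environments.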