How do I use inline modifiers in a C# regex?

Time: 2022-02-12 05:28:38

How do I use the inline modifiers instead of RegexOptions.Option?

For example:

Regex MyRegex = new Regex(@"[a-z]+", RegexOptions.IgnoreCase);

How do I rewrite this using the inline character i?

http://msdn.microsoft.com/en-us/library/yd1hzczs.aspx

2 answers

#1


31  

You can use inline modifiers as follows:

// case-insensitive match
Regex MyRegex = new Regex(@"(?i)[a-z]+");

Or invert the meaning of the modifier by adding a minus sign:

// case-sensitive match
Regex MyRegex = new Regex(@"(?-i)[a-z]+");

or, switch them on and off:

// case sensitive, then case-insensitive match
Regex MyRegex = new Regex(@"(?-i)[a-z]+(?i)[k-n]+");
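
A minimal sketch of that switching pattern in action. The ^ and $ anchors and the sample strings are additions for illustration (so the whole input has to match), and the snippet assumes C# 9 or later for top-level statements:

```csharp
using System;
using System.Text.RegularExpressions;

// [a-z]+ is matched case-sensitively, [k-n]+ case-insensitively.
// The anchors are an addition so that the entire input must match.
var switched = new Regex(@"^(?-i)[a-z]+(?i)[k-n]+$");

Console.WriteLine(switched.IsMatch("abcKLM")); // True: lowercase run, then K-N in any case
Console.WriteLine(switched.IsMatch("ABCklm")); // False: the first part rejects uppercase
```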

Alternatively, you can use the mode-modifier span syntax, a colon : inside a grouping parenthesis, which scopes the modifier to that group only:

// case sensitive, then case-insensitive match
Regex MyRegex = new Regex(@"(?-i:[a-z]+)(?i:[k-n]+)");

You can combine several modifiers in one go, as in (?is-m:text), or write them one after another, as in (?i)(?s)(?-m)text, if you find that clearer (I don't). With the on/off switching syntax, be aware that a modifier stays in effect until the next switch, the end of the enclosing group, or the end of the regex. With a mode-modified span, by contrast, the options that were in effect before the span apply again after it.
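
To make the scoping difference concrete, here is a small sketch comparing the three forms. The patterns and test strings are made up for illustration, and the snippet assumes C# 9 or later for top-level statements:

```csharp
using System;
using System.Text.RegularExpressions;

var toggled = new Regex(@"^(?i)a(?-i)b$"); // i switched on for "a", off again before "b"
var spanned = new Regex(@"^(?i:a)b$");     // i applies only inside the span
var global  = new Regex(@"^(?i)ab$");      // i stays on until the end of the regex

foreach (var s in new[] { "ab", "Ab", "aB", "AB" })
{
    Console.WriteLine($"{s}: toggled={toggled.IsMatch(s)} spanned={spanned.IsMatch(s)} global={global.IsMatch(s)}");
}
// ab: toggled=True spanned=True global=True
// Ab: toggled=True spanned=True global=True
// aB: toggled=False spanned=False global=True
// AB: toggled=False spanned=False global=True
```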

Finally: the allowed modifiers in .NET are (use a minus to invert the mode):

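x  ignore unescaped whitespace in the pattern and allow # comments (RegexOptions.IgnorePatternWhitespace)
s  single-line mode: . also matches \n (RegexOptions.Singleline)
m  multi-line mode: ^ and $ match at the start and end of each line (RegexOptions.Multiline)
i  case insensitivity (RegexOptions.IgnoreCase)
n  only allow explicit capture: only named groups capture (RegexOptions.ExplicitCapture; .NET specific)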
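
A short sketch showing what s, m, x and n change when written inline. The input strings and patterns are invented for illustration, and the snippet assumes C# 9 or later for top-level statements:

```csharp
using System;
using System.Text.RegularExpressions;

string text = "one\ntwo";

// s: '.' also matches the newline character
bool dotPlain  = Regex.IsMatch(text, @"one.two");     // False
bool dotSingle = Regex.IsMatch(text, @"(?s)one.two"); // True

// m: '^' and '$' match at the start and end of every line
int linesPlain = Regex.Matches(text, @"^\w+$").Count;     // 0
int linesMulti = Regex.Matches(text, @"(?m)^\w+$").Count; // 2

// x: unescaped whitespace is ignored and '#' starts a comment
bool spaced = Regex.IsMatch("42", @"(?x) \d \d   # two digits"); // True

// n: unnamed parentheses no longer capture
int groupsPlain    = Regex.Match("ab", @"(a)(?<second>b)").Groups.Count;     // 3
int groupsExplicit = Regex.Match("ab", @"(?n)(a)(?<second>b)").Groups.Count; // 2

Console.WriteLine($"{dotPlain} {dotSingle} {linesPlain} {linesMulti} {spaced} {groupsPlain} {groupsExplicit}");
// False True 0 2 True 3 2
```

Groups.Count includes group 0 (the whole match), which is why the explicit-capture case reports 2 rather than 1.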

#2


7  

Use it in this manner:

Regex MyRegex = new Regex(@"(?i:[a-z]+)");

Put the inline option in front of the pattern, inside a group: (?<option>:<pattern>), where <option> and <pattern> are placeholders rather than literal angle brackets. In this case the option is "i" for IgnoreCase.

Because of the colon, the option above applies only to the subexpression inside that group. To make the option apply to the entire pattern, set it at the beginning on its own:

@"(?i)[a-z]+"
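
As a quick check that a leading (?i) behaves like the RegexOptions.IgnoreCase version from the question, here is a minimal sketch. The sample strings are made up, and the snippet assumes C# 9 or later for top-level statements:

```csharp
using System;
using System.Text.RegularExpressions;

var viaOption = new Regex(@"[a-z]+", RegexOptions.IgnoreCase); // form used in the question
var viaInline = new Regex(@"(?i)[a-z]+");                      // inline equivalent

foreach (var s in new[] { "Hello", "WORLD", "123" })
{
    Console.WriteLine($"{s}: option={viaOption.IsMatch(s)} inline={viaInline.IsMatch(s)}");
}
// Hello: option=True inline=True
// WORLD: option=True inline=True
// 123: option=False inline=False
```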

It is also possible to use multiple options and turn them on and off:

// On: IgnoreCase, ExplicitCapture. Off: IgnorePatternWhitespace
@"(?in-x)[a-z]+"
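
The following sketch shows all three switches having a visible effect. The pattern is a hypothetical variation of the one above (a literal space and a named group were added), and RegexOptions.IgnorePatternWhitespace is passed to the constructor only so that -x has something to turn off. It assumes C# 9 or later for top-level statements:

```csharp
using System;
using System.Text.RegularExpressions;

// i on:  [a-z]+ also matches uppercase letters
// n on:  the unnamed ( ... ) does not capture, only (?<num> ... ) does
// x off: the space in the pattern is a literal space again
var regex = new Regex(@"(?in-x)([a-z]+) (?<num>\d+)", RegexOptions.IgnorePatternWhitespace);

Match m = regex.Match("ABC 123");
Console.WriteLine(m.Success);               // True
Console.WriteLine(m.Groups.Count);          // 2 (group 0 plus "num")
Console.WriteLine(m.Groups["num"].Value);   // 123
Console.WriteLine(regex.IsMatch("ABC123")); // False: the literal space is required
```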

This lets you enable or disable options at different points within a regex, which isn't possible when RegexOptions is applied to the entire pattern.

Here is a slightly in-depth example. I encourage you to play with it to understand when the options are taking effect.

string input = "H2O (water) is named Dihydrogen Monoxide or Hydrogen Hydroxide. The H represents a hydrogen atom, and O is an Oxide atom.";

// n = explicit captures
// x = ignore pattern whitespace
// -i = remove ignorecase option
string pattern = @"di?(?nx-i) ( hydrogen ) | oxide";
var matches = Regex.Matches(input, pattern, RegexOptions.IgnoreCase);
Console.WriteLine("Total Matches: " + matches.Count);
foreach (Match match in matches)
{
    Console.WriteLine("Match: {0} - Groups: {1}", match.Value, match.Groups[1].Captures.Count);
}

Console.WriteLine();

// n = explicit captures
// x = ignore pattern whitespace
// -i = remove ignorecase option
// -x = remove ignore pattern whitespace
pattern = @"di?(?nx-i) (?<H> hydrogen ) (?-x)|oxide";
matches = Regex.Matches(input, pattern, RegexOptions.IgnoreCase);
Console.WriteLine("Total Matches: " + matches.Count);
foreach (Match match in matches)
{
    Console.WriteLine("Match: {0} - Groups: {1}", match.Value, match.Groups["H"].Captures.Count);
}

The output for the above is:

Total Matches: 3
Match: Dihydrogen - Groups: 0
Match: oxide - Groups: 0
Match: oxide - Groups: 0

Total Matches: 3
Match: Dihydrogen - Groups: 1
Match: oxide - Groups: 0
Match: oxide - Groups: 0

Both patterns are run with RegexOptions.IgnoreCase, which makes "di" case-insensitive and so lets it match the capital D in "Dihydrogen". The inline -i then turns case-insensitivity off for the rest of the pattern, which is why "oxide" matches inside "Monoxide" and "Hydroxide" but not the capitalized "Oxide". Since explicit capturing is on, the first pattern has no group for ( hydrogen ): it is not a named group, which explicit capturing requires. The second pattern does have 1 group since it uses (?<H> hydrogen ).

Next, notice that the second pattern ends with (?-x)|oxide. Since IgnorePatternWhitespace is disabled after the hydrogen capture, the remainder of the pattern must not contain any extra whitespace (compare with the first pattern) unless (?x) is turned on again later in the pattern. This serves no real purpose; it just demonstrates when inline options actually kick in.
