Cut out the first word matching a pattern (from a string)

Time: 2023-01-30 14:02:45

I have a sentence:

"This 'is' just an example"

I need to extract the word between the first pair of ' characters.

Up until now, I have been using the following Regex call:

string name_only = Regex.Match("This 'is' just an example", @"\'([^)]*)\'").Groups[1].Value;

Result: is

This worked perfectly fine until another ' appeared:

"This 'is' just an e'xample"

Now I'm getting:

Result: is' just an e

How do I fix this, other than iterating with a "for" loop to find the first two indexes of the ' character and then cutting the word out with Substring?

3 solutions

#1


2  

The problem is that your regex is greedy. If you make the quantifier lazy, as follows, it will work:

@"\'([^)]*?)\'"
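
As a rough illustration of the difference, the sketch below runs the question's greedy pattern and the lazy pattern from this answer against the failing sentence. The class and method names (`QuoteExtractor`, `FirstQuotedGreedy`, and so on) are made up for this example. The third method, which excludes the quote character itself with `[^']`, is not part of this answer; it is shown only as a common alternative.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, named for this example only.
public static class QuoteExtractor
{
    // Greedy "*": takes as many characters as possible, so the match
    // runs from the first quote to the LAST quote in the input.
    public static string FirstQuotedGreedy(string input) =>
        Regex.Match(input, @"\'([^)]*)\'").Groups[1].Value;

    // Lazy "*?": takes as few characters as possible, so the match
    // stops at the first closing quote it can reach.
    public static string FirstQuotedLazy(string input) =>
        Regex.Match(input, @"\'([^)]*?)\'").Groups[1].Value;

    // Alternative (an assumption, not from the answer): forbid the quote
    // inside the group, so even a greedy "*" cannot run past it.
    public static string FirstQuotedNegated(string input) =>
        Regex.Match(input, @"'([^']*)'").Groups[1].Value;
}
```

With the input `This 'is' just an e'xample`, the greedy version returns `is' just an e`, while the lazy and negated-class versions both return `is`.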

#2


1  

By default, regular expressions match greedily: they find the leftmost match, and each quantifier consumes as many characters as it can.

I'd be inclined to make the regular expression more specific about what it should match, thus:

'(([^']|(''))*)'

That should match:

  • The lead-in single-quote character, followed by
  • zero or more instances of
    • a single character other than a single-quote character, or
    • an "escaped" single-quote character: two consecutive single-quote characters,
  • followed by the lead-out single-quote character.

$0 then gives you the entire match, and $1 the contents of the matched quoted value, exclusive of the lead-in/lead-out quotes.
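
A minimal C# sketch of how this pattern might be used, assuming `$0` and `$1` correspond to `Match.Value` and `Groups[1].Value` in .NET. The class name `QuotedValue` and its method names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical wrapper around the pattern from this answer.
public static class QuotedValue
{
    // Opening quote, then any run of non-quote characters or doubled
    // quotes (''), then the closing quote.
    private static readonly Regex Pattern = new Regex(@"'(([^']|(''))*)'");

    // $0: the entire match, including the surrounding quotes.
    public static string WholeMatch(string input) => Pattern.Match(input).Value;

    // $1: the quoted contents, without the surrounding quotes.
    public static string Contents(string input) => Pattern.Match(input).Groups[1].Value;
}
```

For `This 'is' just an e'xample`, `WholeMatch` returns `'is'` and `Contents` returns `is`. For an input with a doubled quote such as `say 'it''s fine' now`, `Contents` returns `it''s fine`; the doubled quote is kept as-is, so a caller that wants a single quote would still need to unescape it.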

#3


0  

http://msdn.microsoft.com/en-us/library/3206d374.aspx#Greedy

Greedy and Lazy Quantifiers

A number of the quantifiers have two versions:

  1. A greedy version.

    A greedy quantifier tries to match an element as many times as possible.

  2. A non-greedy (or lazy) version.

    A non-greedy quantifier tries to match an element as few times as possible. You can turn a greedy quantifier into a lazy quantifier by simply adding a ?.
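
To make the two versions concrete, here is a small sketch that applies `+` and `+?` to the same inputs. The class name `QuantifierDemo`, its method names, and the sample strings are invented for illustration and do not come from the linked documentation.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class comparing a greedy quantifier with its lazy form.
public static class QuantifierDemo
{
    // Greedy "+": one or more digits, as many as possible.
    public static string GreedyDigits(string input) => Regex.Match(input, @"\d+").Value;

    // Lazy "+?": one or more digits, as few as possible (a single digit here).
    public static string LazyDigits(string input) => Regex.Match(input, @"\d+?").Value;

    // Greedy ".+" between angle brackets extends to the last '>'.
    public static string GreedyTag(string input) => Regex.Match(input, @"<.+>").Value;

    // Lazy ".+?" between angle brackets stops at the first '>'.
    public static string LazyTag(string input) => Regex.Match(input, @"<.+?>").Value;
}
```

On the input `2023`, `GreedyDigits` returns `2023` and `LazyDigits` returns `2`. On `<a><b>`, `GreedyTag` returns `<a><b>` and `LazyTag` returns `<a>`.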
