Can someone explain the difference between [0-9]+ and [0-9]++?
2 answers
#1
14
The PCRE engine, which PHP uses for regular expressions, supports "possessive quantifiers":
Quantifiers followed by + are "possessive". They eat as many characters as possible and don't return to match the rest of the pattern. Thus .*abc matches "aabc" but .*+abc doesn't, because .*+ eats the whole string. Possessive quantifiers can be used to speed up processing.
And:
If the PCRE_UNGREEDY option is set (an option which is not available in Perl) then the quantifiers are not greedy by default, but individual ones can be made greedy by following them with a question mark. In other words, it inverts the default behaviour.
The difference is thus:
/[0-9]+/ - one or more digits; greediness defined by the PCRE_UNGREEDY option
/[0-9]+?/ - one or more digits, but as few as possible (non-greedy)
/[0-9]++/ - one or more digits, as many as possible, and never given back by backtracking (possessive)
In greedy-by-default mode, the first and last patterns match the same text when nothing follows them, because both take as many digits as possible. They only behave differently when the rest of the pattern needs some of those digits back: the extra + forbids that.
When PCRE_UNGREEDY is applied (ungreedy-by-default mode), the default is reversed: [0-9]+ takes as few digits as possible, and [0-9]+? takes as many as possible. Possessive quantifiers such as [0-9]++ are not affected by the option.
#2
4
++ (and ?+, *+ and {n,m}+) are called possessive quantifiers.
Both [0-9]+ and [0-9]++ match one or more ASCII digits, but the second one will not allow the regex engine to backtrack into the match if that should become necessary for the overall regex to succeed.
Example:
[0-9]+0 matches the string 00, whereas [0-9]++0 doesn't.
In the first case, [0-9]+ first matches 00, but then backtracks one character to allow the following 0 to match. In the second case, the ++ prevents this, therefore the entire match fails.
Not all regex flavors support this syntax; some others implement atomic groups instead (or even both).