Why does this regular expression pattern match extra characters in the string?

Time: 2022-12-08 21:42:49

I am finding regular expressions hard to learn. See the following Python snippet:

>>> import re
>>> str = "demo"
>>> re.search("d?mo",str)
<_sre.SRE_Match object at 0x00B65330>

In the above example, why does it return a match object even though the string does not match?

I know that the symbol '?' means it will match either 0 or 1 repetitions of the preceding character, but in the above example:

1. 'd' is matched with 'd'
2. 'm' is matched with 'm'
3. 'o' is matched with 'o'

But what is the character 'e' matched with? According to my understanding, only 'dmo' or 'mo' should match the given pattern, so why does 'demo' match?

If I want to match only 'dmo' or 'mo', what is the correct pattern?

2 Solutions

#1


1  

That is because you are using re.search instead of re.match. If you want to match the whole string, you have to do:

re.match("d?mo$",str)

Alternatively, you can also do:

re.search("^d?mo$",str)

to achieve a similar effect.
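
As a rough sketch of the difference, the snippet below runs the same pattern through re.search, re.match, and an anchored search. The names text, pattern, and the result variables are illustrative only, and re.fullmatch is assumed to be available (Python 3.4 or later); it is not part of the original answer.

```python
import re

text = "demo"
pattern = "d?mo"

# search scans for the pattern anywhere in the string, so it finds "mo".
found = re.search(pattern, text)

# match only tries at position 0; "demo" does not start with "dmo" or "mo".
at_start = re.match(pattern, text)

# Anchoring both ends (or using fullmatch) requires the whole string to fit.
anchored = re.search("^d?mo$", text)
whole = re.fullmatch(pattern, text)

# Only strings that are exactly "dmo" or "mo" pass fullmatch.
accepted = [s for s in ["dmo", "mo", "demo", "ddmo"] if re.fullmatch(pattern, s)]

print(found.group(0), at_start, anchored, whole, accepted)
```

If fullmatch is available, it is probably the most direct way to say "only 'dmo' or 'mo'", since no anchors need to be added to the pattern.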

#2


2  

For a regex R, re.search('R', str) is effectively the same as re.match('.*R', str).

So you effectively have (ignoring newlines):

re.match(".*d?mo", "demo")

where the .* matches "de", the d? matches "" and the mo matches "mo".
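
One way to see this breakdown is to wrap each piece of the pattern in its own group. This is only a sketch; the extra parentheses and the names m, pieces, and span are added here for illustration.

```python
import re

# One pair of parentheses per piece of the equivalent match pattern.
m = re.match("(.*)(d?)(mo)", "demo")
pieces = m.groups()

# The original search reports where its match starts and ends in "demo".
span = re.search("d?mo", "demo").span()

print(pieces, span)
```

Here pieces shows what each part consumed, and span shows that the search match covers only the last two characters.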

You can check this with a capturing group:

re.search("(d?mo)", "demo").group(1)
#>>> 'mo'

The d? matches nothing, as it's optional.
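
For contrast, a small sketch with an input where 'd' sits directly before 'mo'. The sample string "a dmo b" and the variable names are made up for illustration.

```python
import re

# "d" sits right before "mo", so the optional d? consumes it.
with_d = re.search("d?mo", "a dmo b").group(0)

# In "demo" an "e" separates "d" from "mo", so d? matches the empty string.
without_d = re.search("d?mo", "demo").group(0)

print(with_d, without_d)
```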
