Matching some of the groups in a Python regular expression

Date: 2022-11-19 23:19:20

Is it possible to construct a regex that matches as many groups as it can, giving up when the string stops matching? For example:

import re
s = 'a b'
m = re.search(r'(\w) (\w) (\w)', s)

I'd like m.group(1) to contain 'a' and m.group(2) to contain 'b' and m.group(3) to contain None.

But re.search() returns None in this case, so there is no match object and no groups at all.
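
A quick check of the behavior described in the question, as a minimal sketch (variable names follow the question): because the string has no third word character, the pattern as a whole fails and nothing is captured.

```python
import re

s = 'a b'
# The pattern demands three word characters separated by single spaces.
m = re.search(r'(\w) (\w) (\w)', s)

# The whole pattern must match; there is no partial match object.
print(m)
```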

1 answer

#1


The pattern requires exactly one word character, a space, a second word character, another space, and a third word character. Your string contains only a letter, a space, and a letter, so the pattern can never match it. You need to modify the pattern so that the trailing part is optional:

import re
s = 'a b'
# The trailing space and word character are wrapped in an optional group.
m = re.search(r'(\w) (\w)( (\w))?', s)

Note the parentheses around the final space and the (\w) group. They create an additional group, which the ? quantifier makes optional. If you don't want this extra group to show up in the match object, you can make it a "non-capturing" group:

m = re.search(r'(\w) (\w)(?: (\w))?', s)

m no longer has a group for the optional space-plus-word-character part, only groups for the word characters themselves. When the third word character is absent, m.group(3) is None.
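
To compare the capturing and non-capturing patterns side by side, here is a minimal sketch that collects m.groups() for each. The `nested` pattern and the `groups_for` helper are not part of the answer above; they are an assumed extension for the case where the second word may be missing too.

```python
import re

capturing = r'(\w) (\w)( (\w))?'
non_capturing = r'(\w) (\w)(?: (\w))?'
# Assumed extension: each optional part wraps the next one,
# so matching stops at the first missing word.
nested = r'(\w)(?: (\w)(?: (\w))?)?'

def groups_for(pattern, s):
    """Hypothetical helper: return the groups tuple, or None if no match."""
    m = re.search(pattern, s)
    return m.groups() if m else None

print(groups_for(capturing, 'a b'))        # ('a', 'b', None, None)
print(groups_for(capturing, 'a b c'))      # ('a', 'b', ' c', 'c')
print(groups_for(non_capturing, 'a b'))    # ('a', 'b', None)
print(groups_for(non_capturing, 'a b c'))  # ('a', 'b', 'c')
print(groups_for(nested, 'a'))             # ('a', None, None)
```

Groups that did not take part in the match come back as None, which is the behavior the question asks for.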
