Python regular expressions - how to capture multiple groups from a wildcard expression?

Time: 2022-11-29 20:15:36

I have a Python regular expression that contains a group which can occur zero or many times - but when I retrieve the list of groups afterwards, only the last one is present. Example:

re.search(r"(\w)*", "abcdefg").groups()

This returns the tuple ('g',)

I need it to return ('a','b','c','d','e','f','g',)

Is that possible? How can I do it?

2 solutions

#1


27  

In addition to Douglas Leeder's solution, here is the explanation:

In regular expressions the group count is fixed. Placing a quantifier after a group does not increase the group count (imagine how all the later group indexes would have to shift because an earlier group matched more than once).

A group with a quantifier is the way to treat a complex sub-expression as a single unit when it needs to match more than once. The regex engine has no option but to save only the last match to the group. In short: there is no way to achieve what you want with a single "unarmed" regular expression, and you have to find another way.
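
To make this concrete, here is a small sketch using the pattern from the question. The variable names are only illustrative. It shows that the compiled pattern reports a single group no matter how often the quantifier repeats it, and that the group ends up holding only the final repetition:

```python
import re

# The pattern from the question: one capturing group under a * quantifier.
pattern = re.compile(r"(\w)*")

# The number of groups is a property of the pattern, not of the input.
group_count = pattern.groups
print(group_count)            # 1

match = pattern.search("abcdefg")

# The overall match covers every repetition...
whole_match = match.group(0)
print(whole_match)            # abcdefg

# ...but the single group keeps only the last repetition.
captured = match.groups()
print(captured)               # ('g',)
```

The whole match is still available through group(0), so the individual pieces can be recovered in a second step.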

#2


37  

re.findall(r"\w","abcdefg")
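
A slightly fuller sketch of the same idea follows. re.findall returns a list, so it is wrapped in tuple() to get the shape the question asked for. The second half is an assumption about a common variation, not something from the original post: when the repeated part sits inside a larger pattern, match the whole run with one group first and then split it with a second findall. The sample string "id: abc, rest" is made up for illustration.

```python
import re

# findall returns every non-overlapping match as a list of strings.
letters = re.findall(r"\w", "abcdefg")
print(letters)     # ['a', 'b', 'c', 'd', 'e', 'f', 'g']

# Convert to a tuple if the exact shape from the question is needed.
as_tuple = tuple(letters)
print(as_tuple)    # ('a', 'b', 'c', 'd', 'e', 'f', 'g')

# Hypothetical two-step variant: capture the whole repeated run once,
# then break that run into its pieces with a second findall.
text = "id: abc, rest"
outer = re.search(r"id: (\w+)", text)
inner = re.findall(r"\w", outer.group(1))
print(inner)       # ['a', 'b', 'c']
```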
