Regex lookahead for a colon, ignoring substitutions

I'm trying to detect whether a line in a Makefile is the start of a "rule". Rules have the syntax <rule-name> : <rule-prerequisites>. So, easy, right? I'll just look ahead for a colon:

(?=[^:]+:(?!=))

The negative lookahead is there to exclude variable assignments, which can have the form FOO := foo.

However, now I also have things like this:

$(FOO:.c=.o) : baz

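Here the variable FOO is expanded, with every .c suffix replaced by .o. The lookahead does report this line as a rule, but in the "wrong" way: it matches the colon inside the substitution rather than the colon that separates the target from its prerequisites.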

The problem is further compounded by this particular line:

ifneq ($(words $(subst :, ,$(CURDIR))), 1)

Here, the lookahead matches, because it finds a colon not followed by an equals sign.
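To make the failure concrete, here is a minimal Python sketch that runs the lookahead from the question against the lines discussed so far. The helper name naive_is_rule is made up for illustration, and using re.search on one line at a time is an assumption about how the pattern is applied.

```python
import re

# The lookahead from the question, unchanged.
NAIVE = re.compile(r"(?=[^:]+:(?!=))")


def naive_is_rule(line):
    """Return True if the question's lookahead matches anywhere in the line."""
    return NAIVE.search(line) is not None


for line in [
    "foo : bar",                                   # a real rule
    "FOO := foo",                                  # an assignment
    "$(FOO:.c=.o) : baz",                          # a rule, matched via the inner colon
    "ifneq ($(words $(subst :, ,$(CURDIR))), 1)",  # not a rule
]:
    print(naive_is_rule(line), line)
```

Only the assignment is rejected; the ifneq line is reported as a rule because of the colon passed to subst.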

Basically, I need to look ahead for a colon but ignore anything inside variable substitutions.

TL;DR: How can I lookahead for a colon but ignore variable substitutions?

regex101 link here, I want to match everything except the last three lines.

1 solution

#1

This isn't pretty, but I think it'll work.

^                              # Start of line
[^()\r\n]*                     # Any number of non parentheses characters
(?:\([^()\r\n]*(?:\([^()\r\n]*(?:\([^()\r\n]*(?:\([^()\r\n]*\))?[^()\r\n]*\))?[^()\r\n]*\))?[^()\r\n]*\))?
[^()\r\n]*                     # Any number of non parentheses characters
:(?!=)                         # Colon NOT followed by an equal-sign
[^()\r\n]*                     # Any number of non parentheses characters
(?:\([^()\r\n]*(?:\([^()\r\n]*(?:\([^()\r\n]*(?:\([^()\r\n]*\))?[^()\r\n]*\))?[^()\r\n]*\))?[^()\r\n]*\))?
[^()\r\n]*                     # Any number of non parentheses characters
$

The two uncommented lines eat up to four levels of nested parentheses. They could be expanded to more levels if necessary.

It matches the beginning of the string up to a possible group of parentheses, "skipping" that, and then continues up to a colon not followed by an equal sign. Then it matches up to another possible group of parentheses (skips), and continues to the end of line.
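As a rough way to try this outside regex101, the pattern can be assembled in Python. This is only a sketch: the helper paren_group and the names NOT_PAREN, GROUP and RULE are invented here, and the nested group is built in a loop instead of being written out by hand. It assumes multiline mode so that ^ and $ anchor at line boundaries.

```python
import re

NOT_PAREN = r"[^()\r\n]*"  # any run of non-parenthesis characters on one line


def paren_group(depth):
    """Build an optional (...) group that tolerates `depth` levels of nesting."""
    group = rf"(?:\({NOT_PAREN}\))?"  # innermost level
    for _ in range(depth - 1):
        # Wrap the previous level in one more pair of parentheses.
        group = rf"(?:\({NOT_PAREN}{group}{NOT_PAREN}\))?"
    return group


GROUP = paren_group(4)  # four levels, as in the answer

RULE = re.compile(
    rf"^{NOT_PAREN}{GROUP}{NOT_PAREN}"  # text and one optional group before the colon
    r":(?!=)"                           # colon NOT followed by an equal sign
    rf"{NOT_PAREN}{GROUP}{NOT_PAREN}$",  # text and one optional group after it
    re.MULTILINE,
)


def is_rule(line):
    """Return True if the line matches the parenthesis-skipping pattern."""
    return RULE.search(line) is not None


for line in [
    "foo : bar",
    "FOO := foo",
    "$(FOO:.c=.o) : baz",
    "ifneq ($(words $(subst :, ,$(CURDIR))), 1)",
]:
    print(is_rule(line), line)
```

Building the group in a loop keeps the nesting depth a single parameter, so raising it does not require editing the pattern by hand.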

Caveats: If there is more than one group of parentheses before or after the colon, it won't work. More groups could be added by duplicating the parentheses-eater though ;) But I don't think the aim was to see how complex we can make a regex ;)
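If a regex is not a hard requirement, a different technique avoids both limits: scan the line once and track parenthesis depth. The sketch below is not the pattern above. The function name is_rule_line is hypothetical, only $(...) style parentheses are tracked, and the first top-level colon is assumed to decide the answer.

```python
def is_rule_line(line):
    """Return True if the first colon outside all parentheses is not part of ':='."""
    depth = 0
    for i, ch in enumerate(line):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth = max(depth - 1, 0)  # tolerate a stray closing parenthesis
        elif ch == ":" and depth == 0:
            # A top-level colon: a rule separator unless '=' follows directly.
            return line[i + 1 : i + 2] != "="
    return False  # no colon outside parentheses


for line in [
    "foo : bar",
    "FOO := foo",
    "$(A:.c=.o) $(B:.c=.o) : baz",  # two groups before the colon
    "ifneq ($(words $(subst :, ,$(CURDIR))), 1)",
]:
    print(is_rule_line(line), line)
```

This handles any number of groups and any nesting depth, but it shares the regex's simplifications: other assignment operators and double-colon forms are not treated specially.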

Update to your example at regex101 here.
