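How can I use a regular expression to find a pattern surrounded by two other patterns, without including the surrounding strings in the match?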

Time: 2022-12-21 11:25:24

I want to use regular expressions (Perl compatible) to be able to find a pattern surrounded by two other patterns, but not include the strings matching the surrounding patterns in the match.

For example, I want to be able to find occurrences of strings like:

Foo Bar Baz

But only have the match include the middle part:

Bar

I know this is possible, but I can't remember how to do it.

4 Answers

#1


4  

In the general case, you probably can't. The simplest approach is to match everything and use a capturing group to extract the portion of interest:

Foo\s+(Bar)\s+Baz

This isn't the same as not including the surrounding text in the match though. That probably doesn't matter if all you want to do is extract "Bar" but would matter if you're matching against the same string multiple times and need to continue from where the previous match left off.
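To make the difference concrete, here is a minimal sketch of the capturing approach. The helper name `capture_middle` and the sample sentence are made up for illustration. It reports the captured text, where it starts, and where a following `/g` match on the same string would resume, which is after "Baz" rather than after "Bar":

```perl
use strict;
use warnings;

# Hypothetical helper: returns the captured middle part, its start offset,
# and the position where the next /g match on the same string would resume.
sub capture_middle {
    my ($text) = @_;
    if ($text =~ /Foo\s+(Bar)\s+Baz/g) {
        my ($middle, $start) = ($1, $-[1]);   # $-[1] is the start of group 1
        return ($middle, $start, pos($text)); # pos() is the end of the whole match
    }
    return;
}

my ($middle, $start, $resume) = capture_middle("The Foo Bar Baz was last");
print "captured '$middle' at offset $start; next match resumes at $resume\n";
# prints: captured 'Bar' at offset 8; next match resumes at 15
```

The resume position (15) is past "Baz", so the surrounding text has been consumed even though only "Bar" was captured.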

Look-around will work in some cases. Tomalak's suggestion:

(?<=Foo\s)Bar(?=\sBaz)

only works for fixed width look-behind (at least in Perl). As of Perl 5.10, the \K assertion can be used to effectively provide variable width look-behind:

Foo\s+\KBar(?=\s+Baz)

which should do what you asked for in all cases, but requires Perl 5.10 or later.

While it would be convenient, there's no equivalent of \K for ending the matched text, so you have to use a look-ahead.
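As a rough sketch of the `\K` plus look-ahead form (Perl 5.10 or later assumed; the helper name `keep_middle` and the input with irregular spacing are invented for the example), the match itself is only "Bar", and the next match would resume directly after it:

```perl
use strict;
use warnings;

# Hypothetical helper: returns the matched text, its start offset, and the
# position where the next /g match on the same string would resume.
sub keep_middle {
    my ($text) = @_;
    if ($text =~ /Foo\s+\KBar(?=\s+Baz)/g) {
        my ($match, $start) = ($&, $-[0]);   # \K resets where the match starts
        return ($match, $start, pos($text)); # look-ahead text is not consumed
    }
    return;
}

my ($match, $start, $resume) = keep_middle("The Foo   Bar  Baz was last");
print "matched '$match' at offset $start; next match resumes at $resume\n";
# prints: matched 'Bar' at offset 10; next match resumes at 13
```

Here `\s+` before `\K` handles a variable amount of whitespace, which a fixed-width look-behind could not.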

#2


7  

Parentheses define the groupings.

"Foo (Bar) Baz"

Example

~> cat test.pl
$a = "The Foo Bar Baz was lass";

$a =~ m/Foo (Bar) Baz/;

print $1,"\n";
~> perl test.pl
Bar

#3


4  

Use lookaround:

(?<=Foo\s)Bar(?=\sBaz)

This would match any "Bar" that is preceded by "Foo" and followed by "Baz", each separated from it by a single whitespace character. "Foo" and "Baz" would not be part of the final match.
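One place where this matters is substitution: because only "Bar" is part of the match, a replacement leaves the surrounding words untouched. A small sketch, with the helper name `replace_middle` and the sample text being assumptions for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: replace only the middle word. "Foo " and " Baz" are
# asserted by the lookarounds but are not part of the matched text.
sub replace_middle {
    my ($text, $replacement) = @_;
    (my $copy = $text) =~ s/(?<=Foo\s)Bar(?=\sBaz)/$replacement/g;
    return $copy;
}

print replace_middle("Foo Bar Baz, Foo Bar Qux", "XXX"), "\n";
# prints: Foo XXX Baz, Foo Bar Qux
```

The second "Bar" is left alone because it is not followed by " Baz".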

#4


2  

$string =~ m/Foo (Bar) Baz/

$1

This may not be exactly what you want as the match is still "Foo Bar Baz". But it shows you how to just get the part that you are interested in. Otherwise you can use lookahead and lookbehind to get the match without consuming characters...
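A slightly more idiomatic variant, sketched here with a made-up helper name `middle_of`, takes the capture from the list-context return value of the match instead of reading `$1` afterwards:

```perl
use strict;
use warnings;

# Hypothetical helper: in list context a match returns its captures,
# so the middle part can be assigned directly (undef if nothing matched).
sub middle_of {
    my ($string) = @_;
    my ($middle) = $string =~ m/Foo (Bar) Baz/;
    return $middle;
}

my $middle = middle_of("The Foo Bar Baz was last");
print defined $middle ? "captured: $middle\n" : "no match\n";
# prints: captured: Bar
```

The overall match is still "Foo Bar Baz"; only the extraction step changes.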
