How can I combine these two regular expressions into one?

Time: 2022-09-15 20:51:34

I have a few thousand strings that have one of these two forms:

SomeT1tle-ThatL00ks L1k3.this - $3.57 KnownWord

SomeT1tle-ThatL00ks L1k3.that - 4.5% KnownWord

The SomeT1tle-ThatL00ks L1k3.this part may contain uppercase and lowercase characters, digits, periods, dashes, and spaces. It is always followed by a space-dash-space separator.

I want to pull out the Title (the part before the space-dash-space separator) and the Amount, which is right before KnownWord.

So for these two strings I'd like:

SomeT1tle-ThatL00ks L1k3.this, $3.57 and

SomeT1tle-ThatL00ks L1k3.that, 4.5%.

This code works (using Perl-compatible regular expressions):

$my_string = "SomeT1tle-ThatL00ks L1k3.this - $3.57 KnownWord";

$pattern_title = "/^(.*?)\x20\x2d\x20/";
$pattern_amount = "/([0-9.$%]+) KnownWord$/";

preg_match_all($pattern_title, $my_string, $matches_title);
preg_match_all($pattern_amount, $my_string, $matches_amount);

echo $matches_title[1][0] . "  " . $matches_amount[1][0] . "<br>";

I tried putting both patterns together:

$pattern_together_doesnt_work = "/^(.*?)\x20\x2d\x20([0-9.$%]+) KnownWord$/";

but the first part of the pattern always matches the whole thing, even with the "lazy" quantifier (.*? rather than .*). I can't use a negated character class to exclude spaces and dashes, because the title itself can contain either.

Any hints?

1 Answer

#1


Use this pattern

/^(.*?)\x20\x2d\x20([0-9.$%]+) KnownWord$/
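This pattern looks the same as the combined one in the question, so the difference may lie in how the result is read rather than in the regex. With preg_match and preg_match_all, index 0 of the matches array holds the entire match and the capture groups start at index 1. The following is a minimal sketch of reading both groups from a single match; the variable names ($samples, $results, $m) are only illustrative, and single-quoted strings are used so that PHP does not try to interpolate the "$" characters.

```php
<?php
// Combined pattern: lazy title, " - " separator, amount, then " KnownWord" at the end.
// Single quotes pass \x20 and \x2d through to PCRE, which reads them as space and dash.
$pattern = '/^(.*?)\x20\x2d\x20([0-9.$%]+) KnownWord$/';

// The two sample strings from the question.
$samples = [
    'SomeT1tle-ThatL00ks L1k3.this - $3.57 KnownWord',
    'SomeT1tle-ThatL00ks L1k3.that - 4.5% KnownWord',
];

$results = [];
foreach ($samples as $s) {
    if (preg_match($pattern, $s, $m)) {
        // $m[0] is the whole matched string; $m[1] is the title, $m[2] the amount.
        $results[] = [$m[1], $m[2]];
        echo $m[1] . ', ' . $m[2] . "\n";
    }
}
```

If $m[0] is printed instead of $m[1], the output is the full input string, which would look like the first group "matching the whole thing".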
