PHP preg_replace won't replace a plain multi-line string

Date: 2023-02-12 08:46:50

I'm having a really weird problem with preg_replace here (and as far as I can remember, this isn't the first time I've seen this). I have an XML with an element with invalid structure (closing tag is missing the slash, breaks parser):

<info> 
<datetime>2013.04.12 12:04:02</datetime> 
<info> 

What I'm trying to do is this: $xml = preg_replace('/<info>.*<info>/iu', '', $xml) (because I don't actually need that element), but it does not replace anything.
How do I make it work?

4 Solutions

#1


3  

Add the s modifier and use ? to make it non-greedy:

$string = '<info> 
<datetime>2013.04.12 12:04:02</datetime> 
<info>
<valid>2013.04.12 12:04:02</valid>
<info> 
<datetime>2013.04.12 12:04:02</datetime> 
<info>';
var_dump(preg_replace('/<info>.*?<info>/s', '', $string));
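
To see why the ? matters here, the following is a minimal sketch that runs the greedy and the non-greedy pattern against the same kind of input. The sample string is a simplified copy of the one above (trailing spaces dropped), and the variable names $greedy and $lazy are only illustrative.

```php
<?php
// Simplified copy of the sample above: two broken <info> blocks around one valid element.
$string = "<info>\n<datetime>2013.04.12 12:04:02</datetime>\n<info>\n"
        . "<valid>2013.04.12 12:04:02</valid>\n"
        . "<info>\n<datetime>2013.04.12 12:04:02</datetime>\n<info>";

// Greedy: .* runs from the first <info> to the LAST <info>, swallowing <valid> too.
$greedy = preg_replace('/<info>.*<info>/s', '', $string);

// Non-greedy: .*? stops at the nearest following <info>,
// so each broken block is removed separately and <valid> survives.
$lazy = preg_replace('/<info>.*?<info>/s', '', $string);

var_dump($greedy); // string(0) ""
var_dump($lazy);   // only the <valid> line and its surrounding newlines remain
```

With a single broken block, as in the question, both variants behave the same; the difference only shows up once the input contains more than one such block.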

#2


4  

Try adding the s modifier to the regex. With it, the dot no longer stops matching at a newline.
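
As a rough illustration, assuming the XML fragment from the question stored in $xml, the same replacement can be run with and without the modifier ($without and $with are just illustrative names):

```php
<?php
$xml = "<info>\n<datetime>2013.04.12 12:04:02</datetime>\n<info>";

// Without s, the dot cannot cross the line breaks, so nothing matches
// and preg_replace returns the subject unchanged.
$without = preg_replace('/<info>.*<info>/iu', '', $xml);

// With s (PCRE_DOTALL), the dot also matches "\n" and the whole block is removed.
$with = preg_replace('/<info>.*<info>/ius', '', $xml);

var_dump($without === $xml); // bool(true)
var_dump($with);             // string(0) ""
```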

#3


4  

It doesn't replace because there are no matches:

<?php

$xml = '<info>
    <datetime>2013.04.12 12:04:02</datetime>
<info>';
var_dump(preg_match('/<info>.*<info>/iu', $xml, $matches), $matches);
int(0)
array(0) {
}

Let's see what's wrong. What does . mean exactly?

match any character except newline (by default)

So there it is! How do you change the default? We have a look at the available internal options and find this:

s for PCRE_DOTALL

... where PCRE_DOTALL means:

s (PCRE_DOTALL)
If this modifier is set, a dot metacharacter in the pattern matches all characters, including newlines. Without it, newlines are excluded.

We can change it locally:

'/<info>(?s:.*)<info>/iu'
          ^

... or globally:

'/<info>.*<info>/ius'
                   ^
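
A quick way to check all three variants side by side is sketched below. The $patterns array and its keys (default, local, global) are made up for this illustration; the patterns themselves are the ones shown above.

```php
<?php
$xml = "<info>\n    <datetime>2013.04.12 12:04:02</datetime>\n<info>";

$patterns = [
    'default' => '/<info>.*<info>/iu',      // dot stops at newlines
    'local'   => '/<info>(?s:.*)<info>/iu', // DOTALL applies only inside the group
    'global'  => '/<info>.*<info>/ius',     // DOTALL applies to the whole pattern
];

$results = [];
foreach ($patterns as $name => $pattern) {
    // preg_match returns 1 on a match and 0 when nothing matches.
    $results[$name] = preg_match($pattern, $xml);
    printf("%-7s => %d\n", $name, $results[$name]);
}
// Prints:
// default => 0
// local   => 1
// global  => 1
```

The local form is handy when other dots in the same pattern should keep their default single-line behaviour.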

#4


2  

See http://www.php.net/manual/en/reference.pcre.pattern.modifiers.php

You need to use the s modifier at the end of your regex.

$xml = preg_replace('/<info>.*<info>/ius', '', $xml);
