How do I remove all tags and their respective content, including other nested elements?

Time: 2023-01-23 21:14:33

I've tried a few solutions, but they only remove the tags themselves, leaving behind the content and any other nested elements.

Regular Expression,

preg_replace('/<span\b[^>]*>(.*?)<\/span>/i', '', $page->body);

I also tried HTML Purifier:

$purifier->set('Core.HiddenElements', array('span'));

$purifier->set('HTML.ForbiddenElements', array('span')); 

1 solution

#1


2  

Depending on your actual strings and what you have already tried, you could use a regular expression (assuming your span tags are plain, non-nested span tags). A more "appropriate" solution, however, would be to use an HTML parser like DOMDocument.
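To illustrate the caveat about regular expressions, here is a small sketch. The input strings and the variable names ($pattern, $flat, $nested) are made up for this example; the pattern is a variant of the one in the question with valid PHP modifiers.

```php
<?php
// i = case-insensitive, s = let "." match newlines (PHP has no "g" modifier)
$pattern = '/<span\b[^>]*>.*?<\/span>/is';

// Flat spans: the span and its content are removed cleanly
$flat = '<p>Keep <span class="x">drop</span> this</p>';
echo preg_replace($pattern, '', $flat), "\n";

// Nested spans: the lazy match stops at the FIRST closing tag,
// so the tail of the outer span and a stray </span> are left behind
$nested = '<p>A<span>outer <span>inner</span> tail</span>B</p>';
echo preg_replace($pattern, '', $nested), "\n";
```

The second call leaves part of the outer span in the output, which is why a parser is the safer choice once spans can be nested.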

You can call getElementsByTagName('span') on the document to get all the span elements, then remove each one from its parent node.
Then use saveHTML to get the HTML code back.

You will get something like this:

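$doc = new DOMDocument;
// assumes $yourpage holds an HTML string; load() would expect a path to an XML file
$doc->loadHTML($yourpage);

// getElementsByTagName() returns a live list, so copy it to an array first;
// removing nodes while iterating the live list would skip elements
$spans = iterator_to_array($doc->getElementsByTagName('span'));
foreach ($spans as $span) {
    // remove via the span's own parent; it is rarely a direct child of the root
    $span->parentNode->removeChild($span);
}

echo $doc->saveHTML();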
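
If the input is a fragment such as $page->body rather than a full page, saveHTML() on the whole document also returns the doctype and the html/body wrappers the parser adds. A minimal sketch of one way around that, assuming a hypothetical helper named stripSpans() (not part of the answer above) and leaving character-encoding handling aside:

```php
<?php
// Hypothetical helper: removes every <span> and everything inside it
// from an HTML fragment and returns the cleaned fragment.
function stripSpans(string $html): string
{
    $doc = new DOMDocument();

    // Collect parser warnings internally instead of emitting them for imperfect markup
    $previous = libxml_use_internal_errors(true);
    // Wrap the fragment in a <div> so its contents can be located after parsing
    $doc->loadHTML('<div>' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // DOMXPath::query() returns a static list, so removing nodes while looping is safe
    $xpath = new DOMXPath($doc);
    foreach ($xpath->query('//span') as $span) {
        $span->parentNode->removeChild($span);
    }

    // The wrapper is the first <div> in document order; serialize only its children
    $wrapper = $doc->getElementsByTagName('div')->item(0);
    $out = '';
    foreach ($wrapper->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

echo stripSpans('<p>Keep <span class="x">drop <b>this</b> <span>too</span></span>me</p>');
```

Here the outer span, the bold text inside it, and the nested span should all be gone from the result, while the surrounding paragraph is kept.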
