How to get the HTML inside $node rather than just $nodeValue

Time: 2022-09-01 23:17:34

Description of the current situation:

I have a folder full of pages (pages-folder); each page in that folder contains, among other things, a div with id="short-info".
I have code that pulls every <div id="short-info">...</div> from that folder and displays the text inside it using textContent (which for this purpose is the same as nodeValue).

The code that loads the divs:

<?php
// Collect every page in the folder, in alphabetical order
$filename = glob("pages-folder/*.php");
sort($filename);
foreach ($filename as $filenamein) {
    $doc = new DOMDocument();
    $doc->loadHTMLFile($filenamein);
    $xpath = new DOMXPath($doc);
    $elements = $xpath->query("//div[@id='short-info']");

    foreach ($elements as $element) {
        $nodes = $element->childNodes;
        foreach ($nodes as $node) {
            // textContent yields only the text, so child markup such as <img> is dropped
            echo $node->textContent;
        }
    }
}
?>

Now the problem is that if the div I am loading has a child element, such as an image: <div id="short-info"> <img src="picture.jpg"> Hello world </div>, the output will be only Hello world rather than the image followed by Hello world.

Question:

How do I make the code display the full html inside the div id="short-info" including for instance that image rather than just the text?

2 Solutions

#1


34  

You can call c14n() on the node. It was undocumented when this answer was written.

$node->c14n() will give you the markup of $node, tags included, serialized as canonical XML.

Crazy right? I lost some hair over that one.

http://php.net/manual/en/class.domnode.php#88441

Update

c14n() will modify the markup, because it outputs canonical XML rather than the original HTML (for example, <img> gains a closing </img> tag). It is better to use

$html = $node->ownerDocument->saveHTML($node);

instead.
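
Note that saveHTML() with a node argument (available since PHP 5.3.6) serializes the node itself, so passing the div would also return the <div id="short-info"> wrapper. A minimal sketch of an innerHTML-style helper that serializes only the children follows. The function name innerHtml and the inline sample markup are made up for illustration, and the type declarations assume PHP 7 or later:

```php
<?php
// Hypothetical helper (not part of PHP's DOM API): joins the serialized
// HTML of every child of $element, leaving out $element's own tag.
function innerHtml(DOMNode $element): string
{
    $html = '';
    foreach ($element->childNodes as $child) {
        // saveHTML($child) serializes that one node, markup included
        $html .= $element->ownerDocument->saveHTML($child);
    }
    return $html;
}

// Sample markup standing in for one of the pages in pages-folder
$doc = new DOMDocument();
$doc->loadHTML('<div id="short-info"><img src="picture.jpg"> Hello world</div>');
$xpath = new DOMXPath($doc);

foreach ($xpath->query("//div[@id='short-info']") as $div) {
    echo innerHtml($div), "\n";
}
```

In the question's loop, the same effect can probably be had by replacing echo $node->textContent; with echo $doc->saveHTML($node);.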

#2


2  

You want the equivalent of innerHTML, which PHP's DOM extension doesn't directly support. One workaround for it is here in the PHP docs.

Another option is to take the $node you've found, insert it as the top-level element of a new DOM document, and then call saveHTML() on that new document.
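
That second option could look roughly like the sketch below. It is only an illustration: the helper name outerHtmlViaNewDocument and the sample markup are invented here, and the type declarations assume PHP 7 or later. The result includes the wrapping element itself, so it corresponds to outerHTML rather than innerHTML.

```php
<?php
// Hypothetical helper: copies $node into a fresh document and
// serializes that whole document.
function outerHtmlViaNewDocument(DOMNode $node): string
{
    $fragmentDoc = new DOMDocument();
    // importNode(..., true) creates a deep copy owned by the new document
    $fragmentDoc->appendChild($fragmentDoc->importNode($node, true));
    // saveHTML() ends its output with a newline; trim it off
    return trim($fragmentDoc->saveHTML());
}

// Sample markup standing in for one of the pages in pages-folder
$doc = new DOMDocument();
$doc->loadHTML('<div id="short-info"><img src="picture.jpg"> Hello world</div>');
$div = (new DOMXPath($doc))->query("//div[@id='short-info']")->item(0);

echo outerHtmlViaNewDocument($div), "\n";
```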
