Unable to parse inside the <code> tag - PHP - Simple HTML DOM

Time: 2022-10-29 16:34:58

I am trying to extract the content of a <div> nested inside a <code> tag with PHP Simple HTML DOM Parser, but I always get the error "Trying to get property of non-object in...", as if the parser were finding no <div> at all.

The code I'm using is:

include_once('simplehtmldom_1_5/simple_html_dom.php');

// Create a DOM object
$html = new simple_html_dom();

// Load HTML
$html->load('<code><div>hello</div></code>');

// Extract div content
echo $html->find('div',0)->innertext;

But if, instead of using <code><div>hello</div></code> as my sample markup, I use <span><div>hello</div></span>, it works. It seems I only have problems when looking inside the code tag.

What's wrong with what I'm doing? I hope you guys can point me in the right direction. Thank you very much for your support!

2 Solutions

#1


1  

simplehtmldom, among other things, strips out preformatted tags. If you want the code tag to be recognized, delete or comment out line 1076 in simple_html_dom.php (simplehtmldom 1.5).
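To make the effect of that stripping step concrete, here is a minimal self-contained sketch. It does not use or reproduce the library's code: the function name strip_code_noise, the regular expression, and the placeholder format are all assumptions chosen for illustration. It only shows why a <div> inside <code> is no longer there to be found once the content has been swapped for a placeholder before parsing.

```php
<?php
// Illustrative stand-in for a "noise stripping" pass. NOT the library's code:
// the function name, pattern and placeholder format are assumptions.
function strip_code_noise(string $html, array &$noise): string
{
    $result = preg_replace_callback(
        '~(<\s*code[^>]*>)(.*?)(<\s*/\s*code\s*>)~is',
        function (array $m) use (&$noise): string {
            // Remember the original content and leave a placeholder in its place.
            $key = '___noise___' . count($noise);
            $noise[$key] = $m[2];
            return $m[1] . $key . $m[3];
        },
        $html
    );

    // preg_replace_callback returns null on a regex error; fall back to the input.
    return $result ?? $html;
}

$noise = [];
$stripped  = strip_code_noise('<code><div>hello</div></code>', $noise);
$untouched = strip_code_noise('<span><div>hello</div></span>', $noise);

echo $stripped, "\n";   // <code>___noise___0</code>
echo $untouched, "\n";  // <span><div>hello</div></span>
```

After this pass the first string contains no <div> markup, so a later search for div elements has nothing to match, while the <span> variant is left intact.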

#2


0  

According to the source code for Simple HTML DOM, it automagically removes code tags when it loads the HTML into the parser.

If you need the functionality, you'll need to remove the remove_noise() call that handles code tags in the load() function within simple_html_dom.php.

This should produce the results you expect, but it may well introduce other issues, depending on the author's reasoning for removing the tags in the first place.
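Whether or not the library is patched, the "Trying to get property of non-object" error comes from reading ->innertext on the null that find('div', 0) returns when nothing matches. A small guard avoids the error. The helper below is hypothetical (innertext_or_default is not part of Simple HTML DOM), and a plain stdClass object stands in for a found node so the sketch runs without the library.

```php
<?php
// Hypothetical helper, not part of Simple HTML DOM:
// read ->innertext only when a node was actually found.
function innertext_or_default($node, string $default = ''): string
{
    return is_object($node) ? (string) $node->innertext : $default;
}

// Stand-in for a node returned by a successful lookup.
$found = new stdClass();
$found->innertext = 'hello';

echo innertext_or_default($found), "\n";              // hello
echo innertext_or_default(null, '(no match)'), "\n";  // (no match)
```

With the real parser, the call would look like innertext_or_default($html->find('div', 0), '(no match)'), which prints the fallback instead of raising the error when the <div> was hidden inside <code>.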
