How to get the full item content from an RSS 2.0 feed

Time: 2021-12-02 01:09:18

I have tried several of the scripts people suggest for parsing RSS, including Magpie and PHP's SimpleXML extension. None of them seem to handle RSS 2.0 well, because they do not give me back the full content chunk. Does anyone have a suggestion for reading a feed like the one at http://chacha102.com/feed/ and getting the full content instead of only the description?

2 answers

#1


Without having read any documentation on the RSS "content" namespace and how it is meant to be used, here is a working SimpleXML script. The trick is to pass the namespace when retrieving the content.

/* the namespace of rss "content" */
$content_ns = "http://purl.org/rss/1.0/modules/content/";

/* load the file */
$rss = file_get_contents("http://chacha102.com/feed/");
/* create SimpleXML object */
$xml = new SimpleXMLElement($rss);
$root=$xml->channel; /* our root element */

foreach($root->item as $item) { /* loop over every item in the channel */
    print "Description: <br>".$item->description."<br><br>";
    print "Full content: <div>";
    foreach($item->children($content_ns) as $content_node) {
        /* loop over all children in the "content" namespace */
        print $content_node."\n";
    }
    print "</div>";
}
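
If you only want the content:encoded element rather than every child in the namespace, something like the following might work. This is a rough sketch, not tested against the feed in the question: the inline sample feed and the helper name full_content() are made up for illustration, and it assumes the feed puts its full text in content:encoded. It falls back to the description when an item has no such element.

```php
<?php
// Hypothetical sample feed, inlined so the sketch runs without network access.
$rss = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>Sample</title>
    <item>
      <title>First post</title>
      <description>Short summary</description>
      <content:encoded><![CDATA[<p>Full body of the first post.</p>]]></content:encoded>
    </item>
    <item>
      <title>Second post</title>
      <description>Only a summary here</description>
    </item>
  </channel>
</rss>
XML;

/* the namespace of rss "content" */
const CONTENT_NS = 'http://purl.org/rss/1.0/modules/content/';

/* Hypothetical helper: return content:encoded if present, else the description. */
function full_content(SimpleXMLElement $item): string
{
    $encoded = $item->children(CONTENT_NS)->encoded;
    if (count($encoded) > 0) {
        return (string) $encoded[0];
    }
    return (string) $item->description;
}

$xml = new SimpleXMLElement($rss);
foreach ($xml->channel->item as $item) {
    echo $item->title, ': ', full_content($item), "\n";
}
```

To run it against a live feed, replace the inline string with the result of file_get_contents() as in the script above.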

#2


What do you have that's not working right now? Parsing RSS should be a trivial process. Try stepping back from heavyweight libraries and just use a few simple XPath queries or the DOMDocument object in PHP.

see: PHP DOMDocument
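
For what it's worth, here is roughly what the XPath route could look like with DOMDocument and DOMXPath. Treat it as a sketch under assumptions: the sample feed is invented so the code runs offline, the $posts array is just an illustrative way to collect results, and it assumes the full text lives in content:encoded. The important part is registering the content namespace before querying.

```php
<?php
// Hypothetical sample feed, inlined so the sketch runs without network access.
$rss = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>Sample</title>
    <item>
      <title>First post</title>
      <description>Short summary</description>
      <content:encoded><![CDATA[<p>Full body of the first post.</p>]]></content:encoded>
    </item>
    <item>
      <title>Second post</title>
      <description>Only a summary here</description>
    </item>
  </channel>
</rss>
XML;

$doc = new DOMDocument();
$doc->loadXML($rss);

$xpath = new DOMXPath($doc);
// Bind a prefix to the content module namespace so XPath can address content:encoded.
$xpath->registerNamespace('content', 'http://purl.org/rss/1.0/modules/content/');

$posts = [];
foreach ($xpath->query('/rss/channel/item') as $item) {
    // Each expression is evaluated relative to the current <item> node.
    $title = $xpath->evaluate('string(title)', $item);
    $body  = $xpath->evaluate('string(content:encoded)', $item);
    if ($body === '') {
        // No full content in this item, so fall back to the short description.
        $body = $xpath->evaluate('string(description)', $item);
    }
    $posts[$title] = $body;
}

print_r($posts);
```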
