Multiple nodes with the same name in XML

Time: 2021-12-19 08:34:04

I have the following XML file:

 <item> 
  <title>Troggs singer Reg Presley dies at 71</title>  
  <description>Reg Presley, the lead singer of British rock band The Troggs, whose hits in the 1960s included Wild Thing, has died aged 71.</description>  
  <link>http://www.bbc.co.uk/news/uk-21332048#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa</link>  
  <guid isPermaLink="false">http://www.bbc.co.uk/news/uk-21332048</guid>  
  <pubDate>Tue, 05 Feb 2013 01:13:07 GMT</pubDate>  
  <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/65701000/jpg/_65701366_65701359.jpg"/>  
  <media:thumbnail width="144" height="81" url="http://news.bbcimg.co.uk/media/images/65701000/jpg/_65701387_65701359.jpg"/> 
</item>  
<item> 
  <title>Horsemeat found at Newry cold store</title>  
  <description>Horse DNA has been found in frozen meat in a cold store in Northern Ireland, as Irish police investigate a third case of contamination.</description>  
  <link>http://www.bbc.co.uk/news/world-europe-21331208#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa</link>  
  <guid isPermaLink="false">http://www.bbc.co.uk/news/world-europe-21331208</guid>  
  <pubDate>Mon, 04 Feb 2013 23:47:38 GMT</pubDate>  
  <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/65700000/jpg/_65700000_002950295-1.jpg"/>  
  <media:thumbnail width="144" height="81" url="http://news.bbcimg.co.uk/media/images/65700000/jpg/_65700001_002950295-1.jpg"/> 
</item>  
<item> 
  <title>US 'will sue' Standard &amp; Poor's</title>  
  <description>Standard &amp; Poor's says it is to be sued by the US government over the credit ratings agency's assessment of mortgage bonds before the financial crisis.</description>  
  <link>http://www.bbc.co.uk/news/21331018#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa</link>  
  <guid isPermaLink="false">http://www.bbc.co.uk/news/21331018</guid>  
  <pubDate>Mon, 04 Feb 2013 22:45:52 GMT</pubDate>  
  <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/65701000/jpg/_65701717_mediaitem65699884.jpg"/>  
  <media:thumbnail width="144" height="81" url="http://news.bbcimg.co.uk/media/images/65701000/jpg/_65701718_mediaitem65699884.jpg"/> 
   </item>  

Now, when I pass "item" as the input node to retrieve data, only the last item node is displayed instead of all the item nodes.

My code is:

    $dom->load($url);
    $link = $dom->getElementsByTagName($tag_name);
    $value = array();

    for ($i = 0; $i < $link->length; $i++) {
        $childnode['name'] = $link->item($i)->nodeName;
        $childnode['value'] = $link->item($i)->nodeValue;
        $value[$childnode['name']] = $childnode['value'];
    }

Here, $url is the URL of my XML page and $tag_name is the name of the node, in this case "item".

The output I get is:

  US 'will sue' Standard &amp; Poor's.Standard &amp; Poor's says it is to be sued by the US government over the credit ratings agency's assessment of mortgage bonds before the financial crisis.http://www.bbc.co.uk/news/21331018#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa.http://www.bbc.co.uk/news/world-europe-21331208.Mon, 04 Feb 2013 22:45:52 GMT

This is the data of the last item only. I want the data of all the item tags, and I want it in this format:

title :-  US 'will sue' Standard &amp; Poor's
description :- Standard &amp; Poor's says it is to be sued by the US government over 
the credit ratings agency's assessment of mortgage bonds before the financial crisis

I also want the names of the child nodes (if any) in my output. Please help me out.

4 solutions

#1


2  

You seem to be looping over the 'item' nodes only and, as other people mentioned, you are overwriting the previous value on each iteration: every node is named 'item', so $value['item'] is reassigned each time.

If you debug the $value array using print_r($value) inside the loop:

$dom->load($url);
$link = $dom->getElementsByTagName($tag_name);
$value = array();

for ($i = 0; $i < $link->length; $i++) {
    $childnode['name'] = $link->item($i)->nodeName;
    $childnode['value'] = $link->item($i)->nodeValue;
    $value[$childnode['name']] = $childnode['value'];

    echo 'iteration: ' . $i . '<br />';
    echo '<pre>'; print_r($value); echo '</pre>';
}

You'll probably see something like this:

// iteration: 0
Array
(
    [item] => Troggs singer Reg Presley dies at 71 ......
)

// iteration: 1
Array
(
    [item] => Horsemeat found at Newry cold store .........
)

// iteration: 2
Array
(
    [item] => US 'will sue' Standard & Poor's .........
)

What you should be doing is this:

$dom = new DOMDocument();
$dom->preserveWhiteSpace = false;
$dom->load($url);
$items = $dom->getElementsByTagName($tag_name);
$values = array();

foreach ($items as $item) {
    $itemProperties = array();

    // Loop through the 'sub' items 
    foreach ($item->childNodes as $child) {
        // Note: using 'localName' to remove the namespace
        if (isset($itemProperties[(string) $child->localName])) {
            // Quickfix to support multiple 'thumbnails' per item (although they have no content)
            $itemProperties[$child->localName] = (array) $itemProperties[$child->localName];
            $itemProperties[$child->localName][] = $child->nodeValue;
        } else {
            $itemProperties[$child->localName] = $child->nodeValue;
        }
    }

    // Append the item to the 'values' array
    $values[] = $itemProperties;

}


// Output the result
echo '<pre>'; print_r($values); echo '</pre>';

Which outputs:

Array
(
    [0] => Array
        (
            [title] => Troggs singer Reg Presley dies at 71
            [description] => Reg Presley, the lead singer of British rock band The Troggs, whose hits in the 1960s included Wild Thing, has died aged 71.
            [link] => http://www.bbc.co.uk/news/uk-21332048#sa-ns_mchannel=rss&ns_source=PublicRSS20-sa
            [guid] => http://www.bbc.co.uk/news/uk-21332048
            [pubDate] => Tue, 05 Feb 2013 01:13:07 GMT
            [thumbnail] => Array
                (
                    [0] => 
                    [1] => 
                )

        )

    [1] => Array
        (
            [title] => Horsemeat found at Newry cold store
            [description] => Horse DNA has been found in frozen meat in a cold store in Northern Ireland, as Irish police investigate a third case of contamination.
            [link] => http://www.bbc.co.uk/news/world-europe-21331208#sa-ns_mchannel=rss&ns_source=PublicRSS20-sa
            [guid] => http://www.bbc.co.uk/news/world-europe-21331208
            [pubDate] => Mon, 04 Feb 2013 23:47:38 GMT
            [thumbnail] => Array
                (
                    [0] => 
                    [1] => 
                )

        )

    [2] => Array
        (
            [title] => US 'will sue' Standard & Poor's
            [description] => Standard & Poor's says it is to be sued by the US government over the credit ratings agency's assessment of mortgage bonds before the financial crisis.
            [link] => http://www.bbc.co.uk/news/21331018#sa-ns_mchannel=rss&ns_source=PublicRSS20-sa
            [guid] => http://www.bbc.co.uk/news/21331018
            [pubDate] => Mon, 04 Feb 2013 22:45:52 GMT
            [thumbnail] => Array
                (
                    [0] => 
                    [1] => 
                )

        )

)
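The thumbnail entries come out empty in the output above because `<media:thumbnail>` is an empty element: its data lives in attributes, not in text content. A minimal sketch of one way to capture them with `getAttribute()` follows. The inline sample, the `<rss>` wrapper, and the `xmlns:media` declaration (the Media RSS namespace URI) are assumptions for illustration; the real feed would be loaded with `$dom->load($url)`.

```php
<?php
// Hypothetical inline sample standing in for the real feed.
// The xmlns:media URI is assumed to match the feed's own declaration.
$xml = <<<'XML'
<rss xmlns:media="http://search.yahoo.com/mrss/">
  <item>
    <title>Troggs singer Reg Presley dies at 71</title>
    <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/65701000/jpg/_65701366_65701359.jpg"/>
    <media:thumbnail width="144" height="81" url="http://news.bbcimg.co.uk/media/images/65701000/jpg/_65701387_65701359.jpg"/>
  </item>
</rss>
XML;

$dom = new DOMDocument();
$dom->preserveWhiteSpace = false;
$dom->loadXML($xml);

$values = array();
foreach ($dom->getElementsByTagName('item') as $item) {
    $itemProperties = array('thumbnail' => array());

    foreach ($item->childNodes as $child) {
        // Skip anything that is not an element (text, comments, ...)
        if ($child->nodeType !== XML_ELEMENT_NODE) {
            continue;
        }

        if ($child->localName === 'thumbnail') {
            // Thumbnails have no text content: read their attributes instead
            $itemProperties['thumbnail'][] = array(
                'url'    => $child->getAttribute('url'),
                'width'  => (int) $child->getAttribute('width'),
                'height' => (int) $child->getAttribute('height'),
            );
        } else {
            $itemProperties[$child->localName] = $child->nodeValue;
        }
    }

    $values[] = $itemProperties;
}

print_r($values);
```

Each item then carries a list of thumbnails with `url`, `width`, and `height` keys rather than a list of empty strings.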

#2


2  

(Don't forget the root node.) It looks like the nodeValue property simply concatenates all of the text nodes under that element (roughly equivalent to xsl:value-of select="."). I haven't done much with DOMDocument and its related classes in PHP, but one thing you can do is canonicalize the DOMNode using the C14N() method and then parse the resulting string. It isn't pretty, but it gets the result you want and is easy to extend:

    $tag_name = 'item';
    $link = $dom->getElementsByTagName($tag_name);
    for ($i = 0; $i < $link->length; $i++) {
        $treeAsString = $link->item($i)->C14N();
        $curBranchParts = explode("\n",$treeAsString);
        $curBranchPartsSize = count($curBranchParts);
        for ($j = 1; $j < ($curBranchPartsSize - 1); $j++) { 
            $curItem = $curBranchParts[$j];
            $curItemParts = explode('<', $curItem);
            $tagWithContent = $curItemParts[1];
            $tagWithContentParts = explode('>',$tagWithContent);
            $tag = $tagWithContentParts[0];
            $content = $tagWithContentParts[1];

            if (trim($content) != '') echo $tag . ' :- ' . $content . '<br />';
            else echo $tag . '<br />';   
        }
    }
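The same "name :- value" output can probably be produced without parsing the canonicalized string, by walking each item's child elements directly. This is only a sketch: `describeChildren()` is a hypothetical helper name, and the inline `<channel>` sample stands in for the real feed.

```php
<?php
// Hypothetical helper: returns one "name :- value" line per child element,
// or just the name when the element has no text content.
function describeChildren(DOMNode $node)
{
    $lines = array();
    foreach ($node->childNodes as $child) {
        // Ignore whitespace text nodes between the elements
        if ($child->nodeType !== XML_ELEMENT_NODE) {
            continue;
        }
        $text = trim($child->nodeValue);
        $lines[] = ($text !== '')
            ? $child->nodeName . ' :- ' . $text
            : $child->nodeName;
    }
    return $lines;
}

// Hypothetical sample data with a single root element
$xml = <<<'XML'
<channel>
  <item>
    <title>Horsemeat found at Newry cold store</title>
    <pubDate>Mon, 04 Feb 2013 23:47:38 GMT</pubDate>
  </item>
  <item>
    <title>US 'will sue' Standard &amp; Poor's</title>
    <pubDate>Mon, 04 Feb 2013 22:45:52 GMT</pubDate>
  </item>
</channel>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);

// One list of lines per <item>
$output = array();
foreach ($dom->getElementsByTagName('item') as $item) {
    $output[] = describeChildren($item);
}

foreach ($output as $lines) {
    echo implode("\n", $lines), "\n\n";
}
```

When echoing into an HTML page, the values would normally be passed through htmlspecialchars() first, since nodeValue returns decoded text (for example a bare & instead of &amp;amp;).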

#3


1  

Your problem is that your source XML needs a root node (you can call it whatever you want). To be well-formed, an XML document must have exactly one root element, that is, one element with no parent and no sibling elements. Once you add the root node, your XML will load into your object.

For example:

<root>
    <item> 
      <title>Troggs singer Reg Presley dies at 71</title>  
      <description>Reg Presley, the lead singer of British rock band The Troggs, whose hits in the 1960s included Wild Thing, has died aged 71.</description>  
      <link>http://www.bbc.co.uk/news/uk-21332048#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa</link>  
      <guid isPermaLink="false">http://www.bbc.co.uk/news/uk-21332048</guid>  
      <pubDate>Tue, 05 Feb 2013 01:13:07 GMT</pubDate>  
      <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/65701000/jpg/_65701366_65701359.jpg"/>  
      <media:thumbnail width="144" height="81" url="http://news.bbcimg.co.uk/media/images/65701000/jpg/_65701387_65701359.jpg"/> 
    </item>  
    <item> 
      <title>Horsemeat found at Newry cold store</title>  
      <description>Horse DNA has been found in frozen meat in a cold store in Northern Ireland, as Irish police investigate a third case of contamination.</description>  
      <link>http://www.bbc.co.uk/news/world-europe-21331208#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa</link>  
      <guid isPermaLink="false">http://www.bbc.co.uk/news/world-europe-21331208</guid>  
      <pubDate>Mon, 04 Feb 2013 23:47:38 GMT</pubDate>  
      <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/65700000/jpg/_65700000_002950295-1.jpg"/>  
      <media:thumbnail width="144" height="81" url="http://news.bbcimg.co.uk/media/images/65700000/jpg/_65700001_002950295-1.jpg"/> 
    </item>  
    <item> 
      <title>US 'will sue' Standard &amp; Poor's</title>  
      <description>Standard &amp; Poor's says it is to be sued by the US government over the credit ratings agency's assessment of mortgage bonds before the financial crisis.</description>  
      <link>http://www.bbc.co.uk/news/21331018#sa-ns_mchannel=rss&amp;ns_source=PublicRSS20-sa</link>  
      <guid isPermaLink="false">http://www.bbc.co.uk/news/21331018</guid>  
      <pubDate>Mon, 04 Feb 2013 22:45:52 GMT</pubDate>  
      <media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/65701000/jpg/_65701717_mediaitem65699884.jpg"/>  
      <media:thumbnail width="144" height="81" url="http://news.bbcimg.co.uk/media/images/65701000/jpg/_65701718_mediaitem65699884.jpg"/> 
    </item>
</root>
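A small sketch of the difference, using a shortened hypothetical fragment rather than the full feed: loading several top-level `<item>` elements fails, while the same markup wrapped in a single root element loads and exposes every item. The `<root>` name is arbitrary, and wrapping a string this way assumes the fragment has no XML declaration of its own.

```php
<?php
// Collect libxml errors internally instead of emitting PHP warnings
libxml_use_internal_errors(true);

// Two top-level <item> elements and no root: not well-formed XML
$fragment = '<item><title>First</title></item>'
          . '<item><title>Second</title></item>';

$dom = new DOMDocument();
$loadedWithoutRoot = $dom->loadXML($fragment);
libxml_clear_errors();

// The same markup wrapped in one root element
$wrapped = new DOMDocument();
$loadedWithRoot = $wrapped->loadXML('<root>' . $fragment . '</root>');
$itemCount = $wrapped->getElementsByTagName('item')->length;

var_dump($loadedWithoutRoot, $loadedWithRoot, $itemCount);
```

Checking the return value of load() or loadXML() in this way is a quick test of whether the document was parsed at all before looking for bugs in the loop.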

#4


0  

I think the problem is in this code:

    for ($i = 0; $i < $link->length; $i++) {
        $childnode['name'] = $link->item($i)->nodeName;
        $childnode['value'] = $link->item($i)->nodeValue;
        $value[$childnode['name']] = $childnode['value'];
    } 

On each iteration the for loop assigns a new value to $childnode['name'] and $childnode['value'], so when the loop ends (once $i reaches $link->length) the $childnode array holds only the values from the last iteration. To work around this, use a multidimensional array, like this:

for ($i = 0; $i < $link->length; $i++) {
    $childnode['name'][$i] = $link->item($i)->nodeName;
    $childnode['value'][$i] = $link->item($i)->nodeValue;
    $value[$childnode['name'][$i]][$i] = $childnode['value'][$i];
}

To test it: print_r($childnode);
