How to get elements from XML markup and store them in a PHP array?

Time: 2021-09-03 15:43:04

I have been searching for a few days for a way, in PHP, to store in a PHP array the list of "URI" values from my XML structure given below, each associated with its respective "surfaceForm".

Example of the structure I am looking for:

[0] => "House Judiciary Committee" 
       [URI] =>"http://dbpedia.org/resource/United_States_House_Committee_on_the_Judiciary"

My XML (= $documentXML):

<?xml version="1.0" encoding="utf-8"?>
<Annotation text="The chairman of the House Judiciary Committee claims he obtained a “smoking gun” email that proves the Obama Justice Department prevented settlement payouts from going to conservative-leaning organizations, even as liberal groups were awarded money and DOJ officials denied “picking and choosing” recipients." confidence="0.35" support="0" types="" sparql="" policy="whitelist">
<Resources>
    <Resource URI="http://dbpedia.org/resource/United_States_House_Committee_on_the_Judiciary" support="805" types="" surfaceForm="House Judiciary Committee" offset="20" similarityScore="0.9999998407862576" percentageOfSecondRank="1.592137605044176E-7"/>
    <Resource URI="http://dbpedia.org/resource/Smoking_gun" support="28" types="" surfaceForm="smoking gun" offset="68" similarityScore="0.9942367821210603" percentageOfSecondRank="0.005796625092271743"/>
    <Resource URI="http://dbpedia.org/resource/Email" support="4760" types="" surfaceForm="email" offset="81" similarityScore="0.9998441228208889" percentageOfSecondRank="1.294724256058748E-4"/>
    <Resource URI="http://dbpedia.org/resource/Barack_Obama" support="14695" types="DBpedia:Agent,Schema:Person,Http://xmlns.com/foaf/0.1/Person,DBpedia:Person,DBpedia:OfficeHolder" surfaceForm="Obama" offset="103" similarityScore="0.99366844805149" percentageOfSecondRank="0.006369900896618705"/>
    <Resource URI="http://dbpedia.org/resource/United_States_Department_of_Justice" support="4536" types="DBpedia:Agent,Schema:Organization,DBpedia:Organisation,Schema:GovernmentOrganization,DBpedia:GovernmentAgency" surfaceForm="Justice Department" offset="109" similarityScore="0.9999993701077011" percentageOfSecondRank="5.355039299447172E-7"/>
    <Resource URI="http://dbpedia.org/resource/Settlement_(litigation)" support="727" types="" surfaceForm="settlement" offset="138" similarityScore="0.984034181518778" percentageOfSecondRank="0.010762752853438246"/>
    <Resource URI="http://dbpedia.org/resource/Liberalism" support="6101" types="" surfaceForm="liberal" offset="215" similarityScore="0.42136818389223374" percentageOfSecondRank="0.7737018249713564"/>
    <Resource URI="http://dbpedia.org/resource/United_States_Department_of_Justice" support="4536" types="DBpedia:Agent,Schema:Organization,DBpedia:Organisation,Schema:GovernmentOrganization,DBpedia:GovernmentAgency" surfaceForm="DOJ" offset="253" similarityScore="1.0" percentageOfSecondRank="0.0"/>
</Resources>
</Annotation>

My attempt in PHP (which doesn't work):

$dom = new DOMDocument();
$dom->loadXML($documentXML);
$xpath = new DOMXPath($dom);

$bs = $documentXML
    ->getElementsByTagName('Annotation/Ressources/Ressource[URI]');

$arrayI = array();
for ($i = 0; $i<sizeof($arrayI); $i++) {
    foreach($bs as $URI){
        $arrayI[$i] = $URI->nodeValue . "\n";
    }
}

Thank you for your help :)


1 solution

#1


2  

DOMElement::getElementsByTagName() does not support XPath expressions, only tag names. DOMXPath::evaluate() is the method you need.

$dom = new DOMDocument();
$dom->loadXML($documentXML);
$xpath = new DOMXPath($dom);

$result = [];
foreach ($xpath->evaluate('Annotation/Resources/Resource') as $resourceNode) {
  $result[$resourceNode->getAttribute('surfaceForm')][] = 
    $resourceNode->getAttribute('URI');
}

var_dump($result);

You need to iterate the Resource element nodes, then read their attributes and build up the array structure as needed.
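If one record per Resource is wanted, closer to the structure sketched in the question, the same loop can append an associative array for each node instead of grouping URIs by surface form. The following is only a sketch under assumptions: the XML is a shortened, hypothetical sample in the same shape as the question's document, and the `$records` and `$byTagName` names are illustrative. It also shows that getElementsByTagName() does work when given a bare tag name.

```php
<?php
// Shortened, hypothetical sample in the same shape as the question's XML.
$documentXML = <<<'XML'
<?xml version="1.0" encoding="utf-8"?>
<Annotation text="sample" confidence="0.35">
<Resources>
    <Resource URI="http://dbpedia.org/resource/Email" surfaceForm="email"/>
    <Resource URI="http://dbpedia.org/resource/United_States_Department_of_Justice" surfaceForm="Justice Department"/>
    <Resource URI="http://dbpedia.org/resource/United_States_Department_of_Justice" surfaceForm="DOJ"/>
</Resources>
</Annotation>
XML;

$dom = new DOMDocument();
$dom->loadXML($documentXML);
$xpath = new DOMXPath($dom);

// Build one record per <Resource> element, in document order.
// Unlike grouping by surfaceForm, this keeps every occurrence as its own entry.
$records = [];
foreach ($xpath->evaluate('/Annotation/Resources/Resource') as $resourceNode) {
    $records[] = [
        'surfaceForm' => $resourceNode->getAttribute('surfaceForm'),
        'URI'         => $resourceNode->getAttribute('URI'),
    ];
}

// getElementsByTagName() takes a plain tag name (no path, no predicate),
// so this selects the same three element nodes.
$byTagName = $dom->getElementsByTagName('Resource');

print_r($records);
```

Which shape is preferable depends on the lookup needed: keying by surfaceForm makes it easy to find all URIs for a given phrase, while a flat list of records preserves the order of annotations in the text.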
