How to convert multiple nodes with the same name in XML into a df / list in R?

Time: 2021-08-01 16:53:57

This is my first experience working with XML in R, so my question will probably sound very naive, if not silly. I downloaded an XML file that follows this pattern:

<experiment>
  <sampleattribute>
    <category>AGE</category>
    <value>8</value>
    <value>10</value>
    <value>11</value>
  </sampleattribute>
  <sampleattribute>
    <category>SEX</category>
    <value>female</value>
    <value>male</value>
  </sampleattribute>
</experiment>
<experiment>
  <sampleattribute>
    <category>DESIGN</category>
    <value>control</value>
    <value>disease</value>
  </sampleattribute>
</experiment>
<experiment>
  <sampleattribute>
    <category>AGE</category>
    <value>8</value>
    <value>10</value>
    <value>11</value>
  </sampleattribute>
  <sampleattribute>
    <category>SEX</category>
    <value>female</value>
  </sampleattribute>
  <sampleattribute>
    <category>DESIGN</category>
    <value>control</value>
    <value>disease</value>
  </sampleattribute>
</experiment>

As you can see, each experiment node has a different set of sampleattribute nodes. I want to combine all the sampleattribute nodes within each experiment in a way that can eventually be converted into a data frame.

I have tried attr <- xpathSApply(myxml, "//experiment/sampleattribute"), but then I have no way to trace which experiment has which sampleattribute nodes.

Thanks very much for any suggestions.

1 solution

#1


You can't get a data frame directly from XML like this, because each experiment has a different number of categories and values. What you get instead is a list.

Using the XML package you can do this, for example:

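library(XML)

# txt is assumed to hold the XML shown in the question as a character string.
# htmlParse tolerates input with several top-level <experiment> elements.
doc <- htmlParse(txt, asText = TRUE)

# For each <experiment>, collect its categories and its values.
res <- lapply(getNodeSet(doc, "//experiment"), function(x) {
  category <- xpathSApply(x, "sampleattribute/category", xmlValue)
  values   <- xpathSApply(x, "sampleattribute/value", xmlValue)
  list(category = category, values = values)
})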

Then you can inspect the result:

 str(res)
 List of 3
 $ :List of 2
  ..$ category: chr [1:2] "AGE" "SEX"
  ..$ values  : chr [1:5] "8" "10" "11" "female" ...
 $ :List of 2
  ..$ category: chr "DESIGN"
  ..$ values  : chr [1:2] "control" "disease"
 $ :List of 2
  ..$ category: chr [1:3] "AGE" "SEX" "DESIGN"
  ..$ values  : chr [1:6] "8" "10" "11" "female" ...
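
Note that in the list above the values of all categories in an experiment end up in one vector, so you can no longer tell which value belongs to which category. If you do want something data-frame shaped, one possible approach is a "long" table with one row per value. The following is only a sketch: the names experiments, rows and long_df are my own, the sample input is a shortened version of the XML from the question, and it assumes every sampleattribute has exactly one category and at least one value.

```r
library(XML)

# Shortened sample input, modelled on the XML in the question.
txt <- "
<experiment>
  <sampleattribute>
    <category>AGE</category>
    <value>8</value>
    <value>10</value>
  </sampleattribute>
  <sampleattribute>
    <category>SEX</category>
    <value>female</value>
  </sampleattribute>
</experiment>
<experiment>
  <sampleattribute>
    <category>DESIGN</category>
    <value>control</value>
    <value>disease</value>
  </sampleattribute>
</experiment>"

doc <- htmlParse(txt, asText = TRUE)
experiments <- getNodeSet(doc, "//experiment")

# Build one small data frame per <sampleattribute>, tagged with the
# index of the <experiment> it came from, then stack them all.
rows <- lapply(seq_along(experiments), function(i) {
  attrs <- getNodeSet(experiments[[i]], "sampleattribute")
  per_attr <- lapply(attrs, function(a) {
    data.frame(
      experiment = i,
      category   = xpathSApply(a, "category", xmlValue),  # assumed: exactly one
      value      = xpathSApply(a, "value", xmlValue),     # assumed: one or more
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, per_attr)
})

long_df <- do.call(rbind, rows)
print(long_df)
```

All values come back as character strings, so a numeric category such as AGE would still need as.numeric() after subsetting. The experiment column is just the position of the node in the document, which is what lets you trace each value back to its experiment.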
