Parsing an XML file with Nokogiri?

Time: 2022-05-25 20:35:02
<DataSet xmlns="http://www.atcomp.cz/webservices">
  <xs:schema xmlns="" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:msdata="urn:schemas-microsoft-com:xml-msdata" id="file_mame">...</xs:schema>
  <diffgr:diffgram xmlns:msdata="urn:schemas-microsoft-com:xml-msdata" xmlns:diffgr="urn:schemas-microsoft-com:xml-diffgram-v1">
    <alldata xmlns="">
      <category diffgr:id="category1" msdata:rowOrder="0">
        <category_code>P...</category_code>
        <category_name>...</category_name>
        <subcategory diffgr:id="subcategory1" msdata:rowOrder="0">
          <category_code>...</category_code>
          <subcategory_code>...</subcategory_code>
          <subcategory_name>...</subcategory_name>
        </subcategory>
....

How can I obtain all categories and subcategories data?

I am trying something like:

reader.xpath('//DataSet/diffgr:diffgram/alldata').each do |node|

But this gives me:

undefined method `xpath' for #<Nokogiri::XML::Reader:0x000001021d1750>

1 solution

#1


4  

Nokogiri's Reader parser does not support XPath. Try using Nokogiri's in-memory Document parser instead.

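On another note, to match namespaced elements in an XPath query, you need to pass a namespace mapping. The prefixes in the mapping are arbitrary labels; only the URIs have to match the document. The alldata element resets the default namespace with xmlns="", so it and its children need no prefix: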

doc = Nokogiri::XML(my_document_string_or_io)

namespaces = { 
  'default' => 'http://www.atcomp.cz/webservices', 
  'diffgr' => 'urn:schemas-microsoft-com:xml-diffgram-v1' 
}
doc.xpath('//default:DataSet/diffgr:diffgram/alldata', namespaces).each do |node|
  # ...
end

Or you can remove the namespaces:

doc.remove_namespaces!
doc.xpath('//DataSet/diffgram/alldata').each do |node|
  # ...
end
