How to convert a Nokogiri document object to JSON

Date: 2022-10-30 14:17:12

I have some parsed Nokogiri::XML::Document objects that I want to print as JSON.

I could serialize it to a string, parse that into a hash with active-record or Crack, and then call Hash#to_json; but that is both ugly and dependent on way too many libraries.

Is there not a simpler way?

As per request in the comment, for example the XML <root a="b"><a>b</a></root> could be represented as JSON:

<root a="b"><a>b</a></root> #=> {"root":{"a":"b"}}
<root foo="bar"><a>b</a></root> #=> {"root":{"a":"b","foo":"bar"}}

That is what I get with Crack now too. And, sure, collisions between attributes and child tags are a potential problem, but I build most of the XML myself, so it is easiest for me to avoid these collisions altogether :)

2 Answers

#1


12  

Here's one way to do it. As noted in my comment, the 'right' answer depends on what your output should be. There is no canonical representation of XML nodes in JSON, and hence no such capability is built into the libraries involved:

require 'nokogiri'
require 'json'
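# Every node becomes an object holding "$name", its attributes,
# "$text" (all descendant text) and "$kids" (its child nodes).
class Nokogiri::XML::Node
  def to_json(*a)
    {"$name"=>name}.tap do |h|
      kids = children.to_a
      h.merge!(attributes)
      h.merge!("$text"=>text) unless text.empty?
      h.merge!("$kids"=>kids) unless kids.empty?
    end.to_json(*a)
  end
end
# A document is serialized as its root element.
class Nokogiri::XML::Document
  def to_json(*a); root.to_json(*a); end
end
# Text nodes become plain JSON strings.
class Nokogiri::XML::Text
  def to_json(*a); text.to_json(*a); end
end
# Attributes become their string value.
class Nokogiri::XML::Attr
  def to_json(*a); value.to_json(*a); end
end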

xml = Nokogiri::XML '<root a="b" xmlns:z="zzz">
  <z:a>Hello <b z:x="y">World</b>!</z:a>
</root>'
puts xml.to_json
# Approximate output, pretty-printed here; whitespace-only text nodes are left out:
{
  "$name":"root",
  "a":"b",
  "$text":"Hello World!",
  "$kids":[
    {
      "$name":"a",
      "$text":"Hello World!",
      "$kids":[
        "Hello ",
        {
          "$name":"b",
          "x":"y",
          "$text":"World",
          "$kids":[
            "World"
          ]
        },
        "!"
      ]
    }
  ]
}

Note that the above completely ignores namespaces, which may or may not be what you want.


Converting to JsonML

Here's another alternative that converts to JsonML. While this is a lossy conversion (it does not support comment nodes, DTDs, or namespace URLs) and the format is a little bit "goofy" by design (the first child element is at [1] or [2] depending on whether or not attributes are present), it does indicate namespace prefixes for elements and attributes:

require 'nokogiri'
require 'json'
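class Nokogiri::XML::Node
  # Name with its namespace prefix, e.g. "z:a". A default namespace has
  # no prefix, so the bare name is returned instead of ":name".
  def namespaced_name
    prefix = namespace && namespace.prefix
    prefix ? "#{prefix}:#{name}" : name
  end
end
class Nokogiri::XML::Element
  # JsonML layout: [name, {attributes} (only if present), child, child, ...]
  def to_json(*a)
    [namespaced_name].tap do |parts|
      unless attributes.empty?
        parts << Hash[ attribute_nodes.map{ |attr| [attr.namespaced_name,attr.value] } ]
      end
      # Keep child elements and text nodes that contain non-whitespace.
      parts.concat(children.select{|n| n.text? ? (n.text=~/\S/) : n.element? })
    end.to_json(*a)
  end
end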
class Nokogiri::XML::Document
  def to_json(*a); root.to_json(*a); end
end
class Nokogiri::XML::Text
  def to_json(*a); text.to_json(*a); end
end
class Nokogiri::XML::Attr
  def to_json(*a); value.to_json(*a); end
end

xml = Nokogiri::XML '<root a="b" xmlns:z="zzz">
  <z:a>Hello <b z:x="y">World</b>!</z:a>
</root>'
puts xml.to_json
#=> ["root",{"a":"b"},["z:a","Hello ",["b",{"z:x":"y"},"World"],"!"]]
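
Because the attribute hash is optional, code that reads JsonML has to check whether the second array entry is a hash before it can find the children. The following is a small sketch of that check using only the standard `json` library; the helper name `jsonml_parts` is made up for this example and is not part of any JsonML library:

```ruby
require 'json'

# Hypothetical helper: splits one JsonML element into its name, attributes
# and children, hiding the "[1] or [2]" quirk described above.
def jsonml_parts(element)
  name, *rest = element
  # The attribute hash is only present when the element has attributes.
  attributes = rest.first.is_a?(Hash) ? rest.shift : {}
  { name: name, attributes: attributes, children: rest }
end

jsonml = JSON.parse('["root",{"a":"b"},["z:a","Hello ",["b",{"z:x":"y"},"World"],"!"]]')

root = jsonml_parts(jsonml)
puts root[:name]                 # prints root
puts root[:attributes].to_json   # prints {"a":"b"}

# <z:a> has no attributes, so its children start right after the name.
first_child = jsonml_parts(root[:children].first)
puts first_child[:name]            # prints z:a
puts first_child[:children].length # prints 3
```

Child entries that are strings are text nodes and child entries that are arrays are nested elements, so the same helper can be applied recursively.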

#2


39  

This works for me:

Hash.from_xml(@nokogiri_object.to_xml).to_json

This method uses Active Support, so if you are not using Rails, require the Active Support core extensions manually.

