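Today I ran into a gotcha in lxml.
Suppose we have a node:
<id>123</id>
If two parent nodes both need this node, you have to create it twice! Appending the same element object to both parents does not work as expected.
Broken example: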
#!/usr/bin/env python
# encoding: utf8
from lxml import etree

if __name__ == "__main__":
    root1 = etree.Element("root1")  # root node 1
    root2 = etree.Element("root2")  # root node 2
    ver_node = etree.Element("id")  # child node
    ver_node.text = "123"
    root1.append(ver_node)  # the same child object is appended to both roots
    root2.append(ver_node)
    # tostring() returns bytes when an encoding is given, so decode before printing
    print(etree.tostring(root1, pretty_print=True, xml_declaration=True, encoding='UTF-8').decode('utf-8'))
    print(etree.tostring(root2, pretty_print=True, xml_declaration=True, encoding='UTF-8').decode('utf-8'))
Output:
<?xml version='1.0' encoding='UTF-8'?>
<root1/>

<?xml version='1.0' encoding='UTF-8'?>
<root2>
  <id>123</id>
</root2>
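Only the second root has the child node; the first one is empty! In lxml an element can have only one parent, so appending it to root2 moves it out of root1 instead of copying it.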
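To see the move happen, it can help to check the node's parent after each append. The following is a minimal sketch that reuses the names from the example above; the variables parent_after_first and parent_after_second are introduced here purely for illustration.

```python
from lxml import etree

root1 = etree.Element("root1")
root2 = etree.Element("root2")
node = etree.Element("id")
node.text = "123"

root1.append(node)
parent_after_first = node.getparent().tag   # the node now lives under root1

root2.append(node)  # moves the node to root2; nothing is copied
parent_after_second = node.getparent().tag  # the node's only parent is now root2

print(parent_after_first, parent_after_second)  # root1 root2
print(len(root1), len(root2))                   # 0 1
```

The child counts confirm that root1 lost the node as soon as it was appended to root2.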
Correct approach:
#!/usr/bin/env python
# encoding: utf8
from lxml import etree
import copy

if __name__ == "__main__":
    root1 = etree.Element("root1")
    root2 = etree.Element("root2")
    ver_node1 = etree.Element("id")
    ver_node1.text = "123"
    ver_node2 = copy.deepcopy(ver_node1)  # deep copy: an independent element
    root1.append(ver_node1)
    root2.append(ver_node2)
    # tostring() returns bytes when an encoding is given, so decode before printing
    print(etree.tostring(root1, pretty_print=True, xml_declaration=True, encoding='UTF-8').decode('utf-8'))
    print(etree.tostring(root2, pretty_print=True, xml_declaration=True, encoding='UTF-8').decode('utf-8'))
Output:
<?xml version='1.0' encoding='UTF-8'?>
<root1>
  <id>123</id>
</root1>

<?xml version='1.0' encoding='UTF-8'?>
<root2>
  <id>123</id>
</root2>
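If the same node has to go under more than two parents, the deep-copy step could be wrapped in a small helper. This is only a sketch: append_copies is a hypothetical function written for this post, not part of lxml, and it assumes the original node should stay unattached as a template.

```python
import copy
from lxml import etree


def append_copies(parents, node):
    """Hypothetical helper: append an independent deep copy of node to each parent."""
    copies = []
    for parent in parents:
        clone = copy.deepcopy(node)  # each parent gets its own element
        parent.append(clone)
        copies.append(clone)
    return copies


root1 = etree.Element("root1")
root2 = etree.Element("root2")
template = etree.Element("id")
template.text = "123"

append_copies([root1, root2], template)
print(etree.tostring(root1).decode("utf-8"))  # <root1><id>123</id></root1>
print(etree.tostring(root2).decode("utf-8"))  # <root2><id>123</id></root2>
```

Because only copies are appended, the template itself never gets a parent and can be reused later.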