How to read an XML file in Python when its encoding is not UTF-8

Time: 2022-01-03 10:31:47

I am trying to read XML files with different encodings. UTF-8 files work fine, but for other encodings such as GB18030 or BIG5 the parser raises an error saying that multi-byte encodings are not supported.

Please suggest a solution for this. Thanks in advance.

1 Solution

#1


0  

One thing to try is opening the file as a file object and parsing its contents with ElementTree.fromstring():

import xml.etree.ElementTree as ET

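# Pass the file's real encoding (GB18030 here is only an example) so that
# Python decodes the text itself before the parser sees it.
with open('file_name.xml', 'r', encoding='gb18030') as f:
    ef = ET.fromstring(f.read())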

This worked for me.

Alternatively, you can use XMLParser:

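# The encoding argument overrides the encoding declared inside the XML file,
# so this helps only when the file's bytes really are UTF-8.
xmlp = ET.XMLParser(encoding="utf-8")
f = ET.parse('file_name.xml', parser=xmlp)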
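
The two ideas above can be combined for multi-byte encodings. The following is only a minimal sketch, assuming you already know the file's real encoding: decode the bytes with Python's own codec, re-encode them as UTF-8, and let XMLParser(encoding="utf-8") override the declaration in the XML prolog. The helper name parse_xml_bytes and the sample document are made up for illustration.

```python
import xml.etree.ElementTree as ET


def parse_xml_bytes(raw: bytes, source_encoding: str) -> ET.Element:
    """Parse XML bytes stored in an encoding the parser cannot decode itself."""
    # Decode with Python's codec, which supports multi-byte encodings
    # such as GB18030 and Big5.
    text = raw.decode(source_encoding)
    # Re-encode as UTF-8 and tell the parser to treat the input as UTF-8;
    # the encoding argument overrides the declaration in the XML prolog.
    parser = ET.XMLParser(encoding="utf-8")
    return ET.fromstring(text.encode("utf-8"), parser=parser)


# Hypothetical sample document, built in memory for illustration only.
sample = '<?xml version="1.0" encoding="GB18030"?><root><name>中文</name></root>'
root = parse_xml_bytes(sample.encode("gb18030"), "gb18030")
print(root.find("name").text)
```

For a file on disk, read it in binary mode, for example open('file_name.xml', 'rb'), and pass the result of f.read() together with the encoding name to the helper.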
