I have a dictionary data where I have stored:
- key: the ID of an event.
- value: the name of this event, where value is a UTF-8 string.
Now, I want to write down this map into a json file. I tried with this:
with open('events_map.json', 'w') as out_file:
    json.dump(data, out_file, indent=4)
but this gives me the error:
UnicodeDecodeError: 'utf8' codec can't decode byte 0xbf in position 0: invalid start byte
Now, I also tried with:
with io.open('events_map.json', 'w', encoding='utf-8') as out_file:
    out_file.write(unicode(json.dumps(data, encoding="utf-8")))
but this raises the same error:
UnicodeDecodeError: 'utf8' codec can't decode byte 0xbf in position 0: invalid start byte
I also tried with:
with io.open('events_map.json', 'w', encoding='utf-8') as out_file:
    out_file.write(unicode(json.dumps(data, encoding="utf-8", ensure_ascii=False)))
but this raises the error:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xbf in position 3114: ordinal not in range(128)
Any suggestions on how I can solve this problem?
EDIT: I believe this is the line that is causing me the problem:
> data['142']
'\xbf/ANCT25'
EDIT 2: The data variable is read from a file. So, after reading it from a file:
data_file_lines = io.open(file_name, 'r', encoding='utf8').readlines()
I then do:
with io.open('data/events_map.json', 'w', encoding='utf8') as json_file:
    json.dump(data, json_file, ensure_ascii=False)
Which gives me the error:
TypeError: must be unicode, not str
Then, I try to do this with the data dictionary:
# the data variable is initialized from these tuples
for tuple in sorted_tuples:
    data[str(tuple[1])] = json.dumps(tuple[0], ensure_ascii=False, encoding='utf8')
which is, again, followed by:
with io.open('data/events_map.json', 'w', encoding='utf8') as json_file:
    json.dump(data, json_file, ensure_ascii=False)
but again, the same error:
TypeError: must be unicode, not str
I get the same error when I use the simple open function for reading from the file:
data_file_lines = open(file_name, "r").readlines()
1 solution
#1
The exception is caused by the contents of your data dictionary: at least one of the keys or values is not UTF-8 encoded.
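One way to locate the offending entries is to try decoding every byte-string key and value as UTF-8 and collect the ones that fail. The following is only a minimal sketch, assuming data is a flat dictionary of byte strings; the helper name find_non_utf8 and the sample entries are hypothetical and not taken from the question.

```python
def find_non_utf8(mapping):
    """Return (key, value) pairs whose key or value is a byte string
    that cannot be decoded as UTF-8."""
    bad = []
    for key, value in mapping.items():
        for item in (key, value):
            # On Python 2, `bytes` is an alias for `str`, so this check
            # matches the byte strings stored in the dictionary.
            if isinstance(item, bytes):
                try:
                    item.decode('utf-8')
                except UnicodeDecodeError:
                    bad.append((key, value))
                    break  # report each entry only once
    return bad


# Hypothetical sample modelled on the value shown in the question.
sample = {b'141': b'Some event', b'142': b'\xbf/ANCT25'}
print(find_non_utf8(sample))
```

The byte 0xbf is a UTF-8 continuation byte, so it can never start a character; that is why the decoder complains about an invalid start byte at position 0.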
You'll have to replace this value, either by substituting a value that is UTF-8 encoded, or by decoding just that value to a unicode object with whatever encoding is correct for it:
data['142'] = data['142'].decode('latin-1')
This decodes that string as a Latin-1-encoded value instead.
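If more than one entry might be affected, the same idea can be applied to the whole dictionary before writing it out. This is a rough sketch, not a definitive fix: the helpers to_text and dump_events are hypothetical names, and falling back to Latin-1 is an assumption carried over from the example above. Latin-1 maps every byte to a character, so the fallback never fails, but it will silently produce wrong text if the real encoding is something else.

```python
import io
import json


def to_text(value, fallback='latin-1'):
    """Decode a byte string to text: UTF-8 first, then the fallback."""
    if isinstance(value, bytes):
        try:
            return value.decode('utf-8')
        except UnicodeDecodeError:
            # Assumption: values that are not UTF-8 are Latin-1.
            return value.decode(fallback)
    return value


def dump_events(mapping, path):
    """Write mapping to path as UTF-8 JSON, decoding byte strings first."""
    clean = dict((to_text(k), to_text(v)) for k, v in mapping.items())
    text = json.dumps(clean, ensure_ascii=False, indent=4)
    if isinstance(text, bytes):
        # On Python 2, json.dumps may return a byte string in some cases;
        # a file opened with io.open only accepts unicode text.
        text = text.decode('utf-8')
    with io.open(path, 'w', encoding='utf-8') as out_file:
        out_file.write(text)
```

Calling dump_events(data, 'events_map.json') would then write the file; with the value from the question, '\xbf/ANCT25' is stored as the character U+00BF (an inverted question mark) followed by /ANCT25.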