UnicodeDecodeError: 'utf-8' codec can't decode byte 0xce in position 22: invalid continuation byte

Date: 2023-03-08 19:28:12

When reading a text file in Python, you would typically write something like this:

# -*- coding:utf-8 -*-
# Open the file as UTF-8 and read its entire contents
with open("train.txt", "r", encoding="utf-8") as f:
    txt = f.read()

But when I ran this, the interpreter raised an error:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xce in position 22: invalid continuation byte

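I then opened the file in Notepad++ and found that it was encoded as GB2312, while my code was trying to decode it as UTF-8, which explains the decode error. My mistake...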
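If you would rather not check the encoding by hand, the same diagnosis can be sketched in Python. This is only a minimal sketch under a few assumptions: the helper name read_text_with_fallback is hypothetical, and I am assuming the file is either UTF-8 or one of the GB family of encodings. It uses gb18030 as the fallback because that codec is a superset of GB2312 and GBK.

```python
from pathlib import Path


def read_text_with_fallback(path, encodings=("utf-8", "gb18030")):
    """Try each encoding in order; return (text, encoding_used).

    Hypothetical helper for illustration. The candidate list is an
    assumption: UTF-8 first, then gb18030 (a superset of GB2312/GBK).
    """
    data = Path(path).read_bytes()  # read raw bytes, decode ourselves
    for enc in encodings:
        try:
            return data.decode(enc), enc
        except UnicodeDecodeError:
            continue  # this encoding does not fit, try the next one
    raise ValueError(f"none of {encodings} could decode {path!r}")


# Usage (assuming train.txt exists in the current directory):
# txt, used = read_text_with_fallback("train.txt")
# print(used)
```

UTF-8 is tried first because its decoder is strict and rejects most GB-encoded Chinese text, as the error above shows. If you already know the file is GB2312, passing encoding='gb2312' (or 'gb18030') directly to open() is simpler.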

After I used Notepad++ to convert the file to UTF-8, the original code read it successfully.
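The conversion I did in Notepad++ can also be done in Python. The following is a rough sketch, not the only way to do it: the function name convert_to_utf8 and the output file name in the usage comment are hypothetical, and the source encoding defaults to gb18030 on the assumption that the file uses GB2312 or a compatible encoding.

```python
from pathlib import Path


def convert_to_utf8(src, dst, src_encoding="gb18030"):
    """Re-encode the file at src as UTF-8 and write it to dst.

    Hypothetical helper for illustration. Works on raw bytes so that
    line endings are preserved exactly as they were in the source.
    """
    raw = Path(src).read_bytes()
    text = raw.decode(src_encoding)  # raises UnicodeDecodeError on mismatch
    Path(dst).write_bytes(text.encode("utf-8"))


# Usage (file names are placeholders):
# convert_to_utf8("train.txt", "train_utf8.txt")
# with open("train_utf8.txt", "r", encoding="utf-8") as f:
#     txt = f.read()
```

Writing to a separate output file keeps the original intact in case the assumed source encoding turns out to be wrong.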