The original document is a .txt file whose encoding was stated to be GBK.

Reading it with the following code raises an exception:

with open("test.txt", 'r', encoding='gbk') as f:
    content = f.read()  # the UnicodeDecodeError is raised here

Exception message: UnicodeDecodeError: 'gbk' codec can't decode byte 0xba in position 17: illegal multibyte sequence

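Adding errors='ignore' suppresses the exception, but the Chinese text comes out garbled, because the bytes that cannot be decoded are silently dropped and the rest is still decoded with the wrong codec: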

with open("test.txt", 'r', encoding='gbk', errors='ignore') as f:
    content = f.read()  # no exception, but the text is garbled
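To see why errors='ignore' hides the problem instead of fixing it, the mismatch can be reproduced in memory. This is only a minimal sketch: the sample character and the helper name decode_strict are hypothetical, and it assumes the file's bytes are really UTF-8 while the code asks for GBK.

```python
from typing import Optional

# Hypothetical sample: one Chinese character stored as UTF-8 bytes.
original = "中"
raw = original.encode("utf-8")  # three bytes: e4 b8 ad


def decode_strict(data: bytes, encoding: str) -> Optional[str]:
    """Return the decoded text, or None if the bytes are invalid for `encoding`."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


# Strict decoding with the wrong codec fails outright.
strict_result = decode_strict(raw, "gbk")

# errors="ignore" never raises, but the result is not the original text.
ignored_result = raw.decode("gbk", errors="ignore")

print(strict_result)               # None: GBK cannot decode these bytes
print(ignored_result == original)  # False: the text was corrupted, not recovered
```

The bytes themselves are fine; only the codec chosen to interpret them is wrong, which is why dropping "bad" bytes cannot restore the text.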

Final solution:

1. Check the encoding in use:

import sys
print(sys.getdefaultencoding())
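It may help to separate two values that are easy to confuse. The sketch below prints both; neither one inspects test.txt, so they only describe the Python environment, not the file.

```python
import locale
import sys

# Default string encoding of the Unicode implementation; "utf-8" on Python 3.
default_encoding = sys.getdefaultencoding()
print(default_encoding)

# Encoding that open() typically falls back to when no encoding argument
# is given; this depends on the operating system and locale settings.
preferred_encoding = locale.getpreferredencoding(False)
print(preferred_encoding)
```

On a Chinese-language Windows system the second value is often a GBK-family code page, which is one common reason UTF-8 files get misread when no encoding is passed to open().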

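This prints utf-8 in my project. Note that on Python 3 sys.getdefaultencoding() always returns 'utf-8' and does not examine the file, so the fix works because test.txt was in fact saved as UTF-8 rather than GBK. Opening it as UTF-8 resolves the garbled text completely: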

with open("test.txt", 'r', encoding='utf-8') as f:
    content = f.read()  # decodes cleanly, Chinese text is intact
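When the real encoding of a file is uncertain, one option is to try a short list of candidates strictly and keep the first that decodes without error. The following is a minimal sketch, not a general detector: the function name read_text_with_fallback and the candidate order ("utf-8" first, then "gbk") are assumptions chosen for this scenario.

```python
from pathlib import Path


def read_text_with_fallback(path, encodings=("utf-8", "gbk")):
    """Return (text, encoding) for the first candidate that decodes cleanly."""
    raw = Path(path).read_bytes()  # read bytes once, decode in memory
    for enc in encodings:
        try:
            # Strict decoding: a wrong codec raises instead of corrupting text.
            return raw.decode(enc), enc
        except UnicodeDecodeError:
            continue
    raise ValueError(f"none of {encodings} could decode {path}")
```

A call such as read_text_with_fallback("test.txt") would return the text together with the encoding that worked. UTF-8 is tried first because it is strict enough that GBK-encoded Chinese text rarely passes as valid UTF-8, whereas the reverse order can accept UTF-8 bytes as wrong GBK characters. Unlike open() in text mode, this sketch does not translate line endings.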
