Garbled Chinese Text When Opening Windows Files on Linux

Date: 2024-04-25 15:10:26

1. Check a file's encoding: the file command

$ file test_file.txt
test_file.txt: ISO-8859 text, with very long lines
$ file train_model.py
train_model.py: Python script, UTF-8 Unicode text executable
$ file 接口文档.docx
接口文档.docx: Microsoft Word 2007+

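However, the file command is not entirely reliable: in the output above, a GB2312-encoded file was identified as ISO-8859. file only guesses from byte patterns, and the high-bit bytes of GB2312 text are also plausible ISO-8859 text.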
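
Since file only guesses, one way to cross-check is to attempt a strict decode with iconv, which exits with a non-zero status on invalid input. The sketch below is an illustration, not part of the original post: the helper name guess_encoding and the candidate list (UTF-8 first, then GB18030, a superset of GB2312 and GBK) are assumptions.

```shell
# guess_encoding FILE: print the first candidate encoding that decodes FILE
# without errors. The candidate order is an assumption: UTF-8 is tried first
# because strict UTF-8 validation rejects most GB-encoded Chinese text, then
# GB18030, which is a superset of GB2312 and GBK.
guess_encoding() {
    for enc in UTF-8 GB18030; do
        # iconv exits non-zero if FILE contains bytes invalid in $enc.
        if iconv -f "$enc" -t UTF-8 "$1" >/dev/null 2>&1; then
            echo "$enc"
            return 0
        fi
    done
    echo "unknown"
    return 1
}
```

For example, guess_encoding test_file.txt prints the first candidate that fits. A pure-ASCII file is reported as UTF-8, since ASCII is valid in both encodings, and a successful decode shows only that the bytes are valid in that encoding, not that it is the one the author used.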

2. Open the file in gedit with a specified encoding

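Open gedit, then choose File => Open. To the right of "Character Encoding" in the lower-left corner there is a drop-down list; select "Add or Remove..." from it. In the settings dialog that appears, pick the encoding you need from the left-hand list and add it to the right-hand list.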


Then select the required encoding when opening the file.


3. Convert the encoding: the iconv command

$ iconv -f gb2312 -t utf-8 test_file.txt -o test_file_utf8.txt
$ file test_file_utf8.txt
test_file_utf8.txt: UTF-8 Unicode text
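
When several files need converting, the same command can be wrapped in a small helper. The following is a minimal sketch under stated assumptions: the function name to_utf8, the .utf8 output suffix, and the default source encoding GB18030 (chosen because it is a superset of GB2312 and GBK) are illustrative and do not come from the original post.

```shell
# to_utf8 FILE [SRC_ENCODING]: write a UTF-8 copy of FILE to FILE.utf8.
# Files that already validate as UTF-8 are skipped. SRC_ENCODING defaults to
# GB18030, which is an assumption; pass another name if the source differs.
to_utf8() {
    src="${2:-GB18030}"
    # Strict UTF-8 -> UTF-8 conversion doubles as a validity check.
    if iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1; then
        echo "skip (already UTF-8): $1"
        return 0
    fi
    # Write to a separate file so the original is never overwritten.
    if iconv -f "$src" -t UTF-8 "$1" > "$1.utf8"; then
        echo "converted: $1 -> $1.utf8"
    else
        rm -f "$1.utf8"
        echo "failed: $1" >&2
        return 1
    fi
}
```

A possible use is: for f in *.txt; do to_utf8 "$f"; done. Writing to a new file rather than converting in place keeps the original available if the source encoding turns out to be wrong.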

The full usage of the iconv command is as follows:

$ iconv --help
Usage: iconv [OPTION...] [FILE...]
Convert encoding of given files from one encoding to another.

 Input/Output format specification:
  -f, --from-code=NAME       encoding of original text
  -t, --to-code=NAME         encoding for output

 Information:
  -l, --list                 list all known coded character sets

 Output control:
  -c                         omit invalid characters from output
  -o, --output=FILE          output file
  -s, --silent               suppress warnings
      --verbose              print progress information

  -?, --help                 Give this help list
      --usage                Give a short usage message
  -V, --version              Print program version

Mandatory or optional arguments to long options are also mandatory or optional
for any corresponding short options.

For bug reporting instructions, please see:
<https://bugs.launchpad.net/ubuntu/+source/glibc/+bugs>.
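
Two of the options listed above are handy when a conversion fails: -l shows which encoding names this iconv build accepts, and -c drops bytes that are invalid in the source encoding instead of aborting. The sketch below assumes the glibc iconv shown in the help output; the helper names has_encoding and to_utf8_lossy are hypothetical.

```shell
# has_encoding NAME: succeed if the output of `iconv -l` mentions NAME
# (case-insensitive substring match, so "GB" matches GB2312, GBK, GB18030).
has_encoding() {
    iconv -l | grep -i -- "$1" >/dev/null
}

# to_utf8_lossy SRC_ENCODING FILE: convert FILE to UTF-8 on stdout, using -c
# to omit bytes that are invalid in SRC_ENCODING.
to_utf8_lossy() {
    # iconv may still exit non-zero after discarding input with -c, so the
    # exit status is deliberately ignored in this sketch.
    iconv -c -f "$1" -t UTF-8 "$2" 2>/dev/null || true
}
```

Because -c silently discards data, it is better suited to salvaging a mostly readable file than to routine conversion.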

Reference: https://www.cnblogs.com/longwaytogo/p/6308703.html