Notes on a garbled file name and file content problem

Date: 2023-01-23 04:11:46

Problem

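I recently extracted an archive on CentOS and found that both the Chinese file names and the Chinese text inside the files were garbled. I resolved it after searching on Baidu, and I am recording the steps here for future reference.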

Solution

File names displayed as garbled text

[titi@mine example]$ ls -l
total 208
-rw-r--r--. 1 titi titi 1427 Sep 18 16:21 10.??????Աʾ??.txt
-rw-r--r--. 1 titi titi 108235 Aug 31 21:18 12.????512????????ѯ.txt
-rw-r--r--. 1 titi titi 882 Sep 30 08:29 13.distinctʾ??.txt
-rw-r--r--. 1 titi titi 3561 Sep 22 17:16 14.??ת??ʾ??.txt
-rw-r--r--. 1 titi titi 499 Sep 7 20:51 15.??IOд????????.txt
-rw-r--r--. 1 titi titi 5036 Sep 30 16:31 16.???ο??Ƶ??벢????_map??.txt
-rw-r--r--. 1 titi titi 19312 Sep 29 12:52 1.ydb_example.sql
-rw-r--r--. 1 titi titi 9346 Sep 21 17:17 2.????ͨ??kafkaʵʱ????????.txt
-rw-r--r--. 1 titi titi 1992 Sep 26 21:24 3.??ʾdemo?.txt
-rw-r--r--. 1 titi titi 2253 Aug 30 13:30 4.???????ݵı?ͨ?????취.txt
-rw-r--r--. 1 titi titi 6344 Sep 18 15:26 8.????????example.txt
-rw-r--r--. 1 titi titi 17303 Sep 3 17:30 9.??????unionʾ??.txt

Decoding the file names

[titi@mine example]$ convmv -f gb2312 -t utf8 -r --notest * > /dev/null && ls -l
-rw-r--r--. 1 titi titi 1427 Sep 18 16:21 10.伴随人员示例.txt
-rw-r--r--. 1 titi titi 108235 Aug 31 21:18 12.超过512个条件查询.txt
-rw-r--r--. 1 titi titi 882 Sep 30 08:29 13.distinct示例.txt
-rw-r--r--. 1 titi titi 3561 Sep 22 17:16 14.行转列示例.txt
-rw-r--r--. 1 titi titi 499 Sep 7 20:51 15.高IO写入的配置.txt
-rw-r--r--. 1 titi titi 5036 Sep 30 16:31 16.如何控制导入并发数_map数.txt
-rw-r--r--. 1 titi titi 19312 Sep 29 12:52 1.ydb_example.sql
-rw-r--r--. 1 titi titi 9346 Sep 21 17:17 2.如何通过kafka实时导入数据.txt
-rw-r--r--. 1 titi titi 1992 Sep 26 21:24 3.演示demo搭建.txt
-rw-r--r--. 1 titi titi 2253 Aug 30 13:30 4.导出数据的变通解决办法.txt
-rw-r--r--. 1 titi titi 6344 Sep 18 15:26 8.多表关联example.txt
-rw-r--r--. 1 titi titi 17303 Sep 3 17:30 9.多分区union示例.txt
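
A note on the convmv step: convmv only renames files when --notest is given; without that flag it just prints the renames it would perform, which makes a safe dry run. To illustrate what the rename amounts to, here is a minimal sketch that does roughly the same job with iconv and mv. The function name rename_gbk_to_utf8 is hypothetical, and the sketch assumes the original names are GBK-encoded (GBK is a superset of GB2312). It is not a replacement for convmv: it does not recurse, and a GBK name that happens to also be valid UTF-8 would be skipped.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of convmv): rename every entry directly inside
# a directory whose name is not valid UTF-8, assuming the bytes are GBK.
rename_gbk_to_utf8() {
    local dir=$1 entry name new
    for entry in "$dir"/*; do
        [ -e "$entry" ] || continue           # empty directory: nothing to do
        name=${entry##*/}
        # Names that already decode as UTF-8 (including plain ASCII) are left alone.
        if printf '%s' "$name" | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1; then
            continue
        fi
        # Skip names that cannot be decoded as GBK either.
        new=$(printf '%s' "$name" | iconv -f GBK -t UTF-8 2>/dev/null) || continue
        [ -n "$new" ] || continue
        [ -e "$dir/$new" ] && continue        # never overwrite an existing entry
        mv -- "$entry" "$dir/$new"
    done
}

# Example call (commented out so that sourcing this file has no side effects):
# rename_gbk_to_utf8 .
```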

Decoding the file contents

Before decoding

[titi@mine example]$ file *
10.伴随人员示例.txt: ISO-8859 text, with CRLF line terminators
12.超过512个条件查询.txt: Non-ISO extended-ASCII text, with very long lines, with CRLF line terminators
13.distinct示例.txt: ISO-8859 text, with CRLF line terminators
14.行转列示例.txt: ISO-8859 text, with CRLF line terminators
15.高IO写入的配置.txt: ISO-8859 text, with CRLF line terminators
16.如何控制导入并发数_map数.txt: ISO-8859 text, with CRLF line terminators
1.ydb_example.sql: ISO-8859 text, with CRLF line terminators
2.如何通过kafka实时导入数据.txt: UTF-8 Unicode (with BOM) text, with very long lines, with CRLF line terminators
3.演示demo搭建.txt: ISO-8859 text, with CRLF line terminators
4.导出数据的变通解决办法.txt: UTF-8 Unicode text, with CRLF line terminators
8.多表关联example.txt: ISO-8859 text, with CRLF line terminators
9.多分区union示例.txt: UTF-8 Unicode C program text, with very long lines, with CRLF line terminators

Decoding command

[titi@mine example]$ file -F ' ' * | grep ISO-8859 | awk '{print $1}' | xargs -I xxx iconv -f gb2312 -t utf8 xxx -o xxx
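
The command above writes iconv's output back onto its own input file and takes the file name from the first whitespace-separated field, so it relies on names without spaces. Writing onto the input may also not be safe with every iconv implementation, and a failed conversion could leave a partly written file. As a more defensive variant, here is a minimal sketch that converts through a temporary file and touches the original only when iconv succeeds. The function name convert_gbk_file is hypothetical, and GBK is assumed as the source encoding.

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert one file from GBK to UTF-8 through a temporary
# file, replacing the original contents only if iconv succeeds.
convert_gbk_file() {
    local src=$1 tmp
    # Files that are already valid UTF-8 (including pure ASCII) are left untouched.
    if iconv -f UTF-8 -t UTF-8 "$src" >/dev/null 2>&1; then
        return 0
    fi
    tmp=$(mktemp "${src}.XXXXXX") || return 1
    if iconv -f GBK -t UTF-8 "$src" > "$tmp"; then
        # Write the contents back in place so permissions and ownership are kept.
        cat "$tmp" > "$src" && rm -f -- "$tmp"
    else
        rm -f -- "$tmp"
        return 1
    fi
}

# Example loop (commented out so that sourcing this file has no side effects):
# for f in *.txt *.sql; do convert_gbk_file "$f" || echo "failed: $f" >&2; done
```

Checking for valid UTF-8 first also means the helper does not depend on the wording of file's description.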

After decoding

[titi@mine example]$ file *
10.伴随人员示例.txt: UTF-8 Unicode text, with CRLF line terminators
12.超过512个条件查询.txt: Non-ISO extended-ASCII text, with very long lines, with CRLF line terminators
13.distinct示例.txt: UTF-8 Unicode text, with CRLF line terminators
14.行转列示例.txt: UTF-8 Unicode text, with CRLF line terminators
15.高IO写入的配置.txt: UTF-8 Unicode text, with CRLF line terminators
16.如何控制导入并发数_map数.txt: UTF-8 Unicode text, with CRLF line terminators
1.ydb_example.sql: UTF-8 Unicode text, with CRLF line terminators
2.如何通过kafka实时导入数据.txt: UTF-8 Unicode (with BOM) text, with very long lines, with CRLF line terminators
3.演示demo搭建.txt: UTF-8 Unicode text, with CRLF line terminators
4.导出数据的变通解决办法.txt: UTF-8 Unicode text, with CRLF line terminators
8.多表关联example.txt: UTF-8 Unicode text, with CRLF line terminators
9.多分区union示例.txt: UTF-8 Unicode C program text, with very long lines, with CRLF line terminators
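
In the listing above, 12.超过512个条件查询.txt is still reported as "Non-ISO extended-ASCII text": its description never contained the string ISO-8859, so the grep filter did not select it and it was not converted. To catch such leftovers without depending on file's wording, one option is to test each file for valid UTF-8 directly. The following is a minimal sketch; the function name list_non_utf8 is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the regular files directly inside a directory
# whose contents are not valid UTF-8, one path per line.
list_non_utf8() {
    local dir=${1:-.} f
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        # iconv exits non-zero when the input is not valid UTF-8.
        iconv -f UTF-8 -t UTF-8 "$f" >/dev/null 2>&1 || printf '%s\n' "$f"
    done
}

# Example call (commented out so that sourcing this file has no side effects):
# list_non_utf8 .
```

Any file it prints still needs converting, for example with iconv -f gbk -t utf8 if the content is in fact GBK, which is an assumption to check for each file.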