Loading UTF-8 encoded text into a MySQL table

Time: 2023-01-05 21:07:42

I have a large CSV file that I want to load into a MySQL table. The data is encoded in UTF-8 because it includes some non-English characters. I have already set the character set of the corresponding column in the table to UTF-8. But when I load the file, the non-English characters turn into weird characters (as seen when I run a SELECT on the table rows). Do I need to encode my data before I load it into the table? If yes, how can I do this? I am using Python to load the data, with the LOAD DATA LOCAL INFILE command. Thanks.

4 solutions

#1


14  

As stated in http://dev.mysql.com/doc/refman/5.1/en/load-data.html, you can specify the character set used by your CSV file with the optional CHARACTER SET clause of LOAD DATA LOCAL INFILE.

#2


65  

Try:

LOAD DATA INFILE 'file'
IGNORE INTO TABLE table
CHARACTER SET UTF8
FIELDS TERMINATED BY ';'
OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
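
Since the question loads the file from Python, the statement above can also be issued through a DB-API cursor. What follows is only a minimal sketch, assuming the MySQLdb (mysqlclient) driver and a connection opened with local_infile=1 so that LOCAL loads are allowed on the client side. The helper names build_load_sql and load_csv are hypothetical and do not come from the answers here.

```python
import re


def build_load_sql(table):
    """Return a LOAD DATA LOCAL INFILE statement that declares the file as UTF-8.

    The file path is left as a %s placeholder so the driver can quote it.
    A table name cannot be passed as a parameter, so it is validated instead.
    """
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", table):
        raise ValueError("unexpected table name: %r" % (table,))
    return (
        "LOAD DATA LOCAL INFILE %s "
        "IGNORE INTO TABLE `" + table + "` "
        "CHARACTER SET utf8 "
        "FIELDS TERMINATED BY ';' "
        "OPTIONALLY ENCLOSED BY '\"' "
        "LINES TERMINATED BY '\\n'"
    )


def load_csv(conn, path, table):
    """Run the statement on an open DB-API connection (assumed: MySQLdb)."""
    cur = conn.cursor()
    try:
        # The driver substitutes the quoted path for the %s placeholder.
        cur.execute(build_load_sql(table), (path,))
        conn.commit()
    finally:
        cur.close()
```

A call would look like load_csv(conn, 'data.csv', 'my_table'), where both names are placeholders and conn comes from MySQLdb.connect() with local_infile=1 in addition to the usual host and credentials. The server must also permit LOCAL loads for this to work.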

#3


2  

You should send

init_command = 'SET NAMES UTF8'
use_unicode = True
charset = 'utf8'

when calling MySQLdb.connect(), e.g.:

import MySQLdb

dbconfig = {}
dbconfig['host']            = 'localhost'
dbconfig['user']            = ''
dbconfig['passwd']          = ''
dbconfig['db']              = ''
dbconfig['init_command']    = 'SET NAMES UTF8'
dbconfig['use_unicode']     = True
dbconfig['charset']         = 'utf8'

conn = MySQLdb.connect(**dbconfig)

Edit: ah, sorry, I see you've added that you're using LOAD DATA LOCAL INFILE; this wasn't clear from your initial question :)

#4


2  

You do not need to encode the characters in the file, but you do need to make sure that the file is encoded in UTF-8 before loading it into the database.
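
One way to check this from Python before running the load is sketched below. It uses only the standard library. The helper names is_utf8 and convert_to_utf8 are hypothetical, and the source encoding passed to convert_to_utf8 is something you have to know yourself, since it cannot be reliably guessed from the bytes.

```python
import codecs


def is_utf8(path, chunk_size=65536):
    """Return True if the whole file decodes as UTF-8, reading it in chunks."""
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        with open(path, "rb") as fh:
            while True:
                chunk = fh.read(chunk_size)
                if not chunk:
                    break
                decoder.decode(chunk)
            # final=True raises if the file ends in the middle of a character.
            decoder.decode(b"", final=True)
    except UnicodeDecodeError:
        return False
    return True


def convert_to_utf8(src, dst, src_encoding):
    """Re-encode a text file to UTF-8; the caller supplies the source encoding."""
    # newline="" keeps the original line endings untouched in both directions.
    with open(src, "r", encoding=src_encoding, newline="") as fin, \
            open(dst, "w", encoding="utf-8", newline="") as fout:
        for line in fin:
            fout.write(line)
```

If is_utf8 returns False for the CSV, the file was saved in some other encoding; converting it first, or naming its real encoding in the CHARACTER SET clause, should avoid the garbled characters.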
