PyMySQL Warning: (1366, "Incorrect string value: '\\xF0\\x98\\x8D...'")

Time: 2023-02-05 10:44:04

I'm attempting to import data (tweets and other Twitter text information) into a database using Pandas and MySQL. I received the following warnings:

166: Warning: (1366, "Incorrect string value: '\xF0\x9F\x92\x9C\xF0\x9F...' for column 'text' at row 3")
  result = self._query(query)

166: Warning: (1366, "Incorrect string value: '\xF0\x9F\x98\x8D t...' for column 'text' at row 5")
  result = self._query(query)

After a thorough search, it seems that something is wrong with the way my database columns are set up. I've tried setting the database charset to utf8 and the collation to utf8_unicode_ci, but I still get the same warnings.

The following is the code that imports the data to the database:

# Create the connection and write the table into MySQL
from sqlalchemy import create_engine

engine = create_engine("mysql+pymysql://{user}:{pw}@{lh}/{db}?charset=utf8"
                       .format(user="user",
                               pw="pass",
                               db="blahDB",
                               lh="bla.com/aald/"))

# df is the pandas DataFrame holding the tweets
df.to_sql(con=engine, name='US_tweets', if_exists='replace')

The data I'm importing consists of the following data types: 'int64', 'object' and 'datetime64[ns]'. I found these by printing the columns to the console, for example:

print(df['tweett'])  # reports dtype 'object'

I'd appreciate any help, thanks!

1 solution

#1

You need utf8mb4, not utf8, when connecting to MySQL and in the columns involved.
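
For context on why this matters: MySQL's utf8 character set (an alias for utf8mb3) stores at most 3 bytes per character, while the byte sequences quoted in the warnings are 4-byte UTF-8 sequences, which is how emoji are encoded. The sketch below decodes those bytes and checks whether a string needs utf8mb4. The helper name needs_utf8mb4 is made up for illustration and is not part of PyMySQL or pandas.

```python
def needs_utf8mb4(text: str) -> bool:
    """Return True if text contains a character outside the Basic Multilingual Plane.

    Such characters take 4 bytes in UTF-8, one more than MySQL's legacy
    utf8 (utf8mb3) character set can store per character.
    (Hypothetical helper, for illustration only.)
    """
    return any(ord(ch) > 0xFFFF for ch in text)


# The byte sequences quoted in the two warnings decode to emoji.
purple_heart = b"\xF0\x9F\x92\x9C".decode("utf-8")
heart_eyes = b"\xF0\x9F\x98\x8D".decode("utf-8")

print(hex(ord(purple_heart)), len(purple_heart.encode("utf-8")))  # 0x1f49c 4
print(needs_utf8mb4("plain ASCII tweet"))                         # False
print(needs_utf8mb4("love this " + heart_eyes))                   # True
```

A check like this could be run over the text column before the import to confirm which rows trigger the warning.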

More Python tips: http://mysql.rjweb.org/doc.php/charcoll#python (but use utf8mb4 wherever that page says utf8; leave "UTF-8" itself unchanged).
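
Applied to the code in the question, the change might look roughly like the sketch below. It only builds the strings involved, so it runs without a database. The helper names build_mysql_url and convert_table_sql and the host "localhost" are assumptions for illustration; the user, password, database and table names are the placeholders from the question.

```python
from urllib.parse import quote_plus


def build_mysql_url(user: str, pw: str, host: str, db: str,
                    charset: str = "utf8mb4") -> str:
    """Build a mysql+pymysql URL that requests the given connection charset.

    (Hypothetical helper; host is assumed to be a plain host name.)
    """
    # quote_plus keeps special characters in the credentials from breaking the URL.
    return "mysql+pymysql://{user}:{pw}@{host}/{db}?charset={charset}".format(
        user=quote_plus(user), pw=quote_plus(pw), host=host, db=db,
        charset=charset)


def convert_table_sql(table: str) -> str:
    """Return a statement that converts an existing table's text columns to utf8mb4."""
    return ("ALTER TABLE `{}` CONVERT TO CHARACTER SET utf8mb4 "
            "COLLATE utf8mb4_unicode_ci".format(table))


url = build_mysql_url("user", "pass", "localhost", "blahDB")
print(url)  # mysql+pymysql://user:pass@localhost/blahDB?charset=utf8mb4
print(convert_table_sql("US_tweets"))
```

The URL would be passed to create_engine as in the question, and the ALTER TABLE statement run once against an existing table. One caveat, as far as I can tell: with if_exists='replace', pandas drops and recreates the table, so the new table picks up the database's default character set. In that case it is probably the database default that needs changing, for example with ALTER DATABASE blahDB CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci.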
