How do I set MySQL's default collation for utf8 to utf8_unicode_ci?

Date: 2022-02-24 11:01:42

I'm converting a database to the utf8 character set and the utf8_unicode_ci collation. When I alter a table's character set to utf8, MySQL automatically converts the table's columns to the default collation for utf8, which is utf8_general_ci. I don't want to run hundreds of ALTER commands to convert every column to utf8_unicode_ci, so can I change the default collation for utf8 to utf8_unicode_ci, as reported by information_schema?

SELECT * FROM information_schema.COLLATIONS WHERE CHARACTER_SET_NAME = 'utf8';

+---------------------------+--------------------+-----+------------+-------------+---------+
| COLLATION_NAME            | CHARACTER_SET_NAME | ID  | IS_DEFAULT | IS_COMPILED | SORTLEN |
+---------------------------+--------------------+-----+------------+-------------+---------+
| utf8_general_ci           | utf8               |  33 | Yes        | Yes         |       1 |
| utf8_bin                  | utf8               |  83 |            | Yes         |       1 |
| utf8_unicode_ci           | utf8               | 192 |            | Yes         |       8 |
| utf8_icelandic_ci         | utf8               | 193 |            | Yes         |       8 |
| utf8_latvian_ci           | utf8               | 194 |            | Yes         |       8 |
| utf8_romanian_ci          | utf8               | 195 |            | Yes         |       8 |
| utf8_slovenian_ci         | utf8               | 196 |            | Yes         |       8 |
| utf8_polish_ci            | utf8               | 197 |            | Yes         |       8 |
| utf8_estonian_ci          | utf8               | 198 |            | Yes         |       8 |
| utf8_spanish_ci           | utf8               | 199 |            | Yes         |       8 |
| utf8_swedish_ci           | utf8               | 200 |            | Yes         |       8 |
| utf8_turkish_ci           | utf8               | 201 |            | Yes         |       8 |
| utf8_czech_ci             | utf8               | 202 |            | Yes         |       8 |
| utf8_danish_ci            | utf8               | 203 |            | Yes         |       8 |
| utf8_lithuanian_ci        | utf8               | 204 |            | Yes         |       8 |
| utf8_slovak_ci            | utf8               | 205 |            | Yes         |       8 |
| utf8_spanish2_ci          | utf8               | 206 |            | Yes         |       8 |
| utf8_roman_ci             | utf8               | 207 |            | Yes         |       8 |
| utf8_persian_ci           | utf8               | 208 |            | Yes         |       8 |
| utf8_esperanto_ci         | utf8               | 209 |            | Yes         |       8 |
| utf8_hungarian_ci         | utf8               | 210 |            | Yes         |       8 |
| utf8_sinhala_ci           | utf8               | 211 |            | Yes         |       8 |
| utf8_german2_ci           | utf8               | 212 |            | Yes         |       8 |
| utf8_croatian_mysql561_ci | utf8               | 213 |            | Yes         |       8 |
| utf8_unicode_520_ci       | utf8               | 214 |            | Yes         |       8 |
| utf8_vietnamese_ci        | utf8               | 215 |            | Yes         |       8 |
| utf8_general_mysql500_ci  | utf8               | 223 |            | Yes         |       1 |
| utf8_croatian_ci          | utf8               | 576 |            | Yes         |       8 |
| utf8_myanmar_ci           | utf8               | 577 |            | Yes         |       8 |
+---------------------------+--------------------+-----+------------+-------------+---------+

Note the IS_DEFAULT column.

Please also note that I'm not asking how to convert a database, table or column using ALTER!

Additionally, adding collation_server = utf8_unicode_ci to my.cnf does not work.

2 answers

#1


1  

You need one ALTER per table, not per column (Reference):

ALTER TABLE foo CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;

You can generate all the ALTER statements, then copy and run them manually. Something like:

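-- Emit one ALTER per base table; backticks protect unusual schema/table names.
SELECT CONCAT('ALTER TABLE `', table_schema, '`.`', table_name,
              '` CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;')
    FROM information_schema.tables
    WHERE table_type = 'BASE TABLE'  -- views cannot be converted with ALTER TABLE
      AND table_schema NOT IN ('mysql', 'information_schema',
                               'performance_schema', 'sys');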

But I suggest you CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_520_ci so that you can handle all of Chinese, plus Emoji.

I hope you did CONVERT TO, not just MODIFY COLUMN. The former converts the characters; the latter will make a mess of any 8-bit characters already in the table.

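One gotcha with utf8mb4 happens if you have indexes on VARCHAR(255). Older InnoDB row formats limit an indexed column to 767 bytes, and utf8mb4 reserves up to 4 bytes per character, so 191 characters (764 bytes) is the most that fits. If practical, shrink the size to 191 or less.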
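
To see which columns would run into this before converting, a query along the following lines may help. It is only a sketch: it assumes the 191-character threshold applies to your server (newer InnoDB row formats allow longer index keys), and it treats an index prefix length (sub_part) as the indexed size when one is defined.

```sql
-- Sketch: list indexed CHAR/VARCHAR columns whose indexed length exceeds
-- 191 characters. The 191 threshold is an assumption that holds for the
-- 767-byte index limit at 4 bytes per character.
SELECT s.table_schema,
       s.table_name,
       s.index_name,
       s.column_name,
       c.character_maximum_length AS declared_chars,
       s.sub_part                 AS indexed_prefix_chars
FROM information_schema.statistics AS s
JOIN information_schema.columns AS c
  ON  c.table_schema = s.table_schema
  AND c.table_name   = s.table_name
  AND c.column_name  = s.column_name
WHERE c.data_type IN ('char', 'varchar')
  -- sub_part is NULL when the whole column is indexed
  AND COALESCE(s.sub_part, c.character_maximum_length) > 191
  AND s.table_schema NOT IN ('mysql', 'information_schema',
                             'performance_schema', 'sys')
ORDER BY s.table_schema, s.table_name, s.index_name;
```

Any row it returns is a candidate for shrinking the column or indexing only a prefix of it.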

Example

mysql> SHOW CREATE TABLE iidr\G
*************************** 1. row ***************************
       Table: iidr
Create Table: CREATE TABLE `iidr` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `key2` int(10) unsigned NOT NULL,
  `vc` varchar(99) DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `key2` (`key2`)
) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=utf8
1 row in set (0.00 sec)

mysql> SHOW FULL COLUMNS FROM iidr;
+-------+------------------+-----------------+------+-----+---------+----------------+---------------------------------+---------+
| Field | Type             | Collation       | Null | Key | Default | Extra          | Privileges                      | Comment |
+-------+------------------+-----------------+------+-----+---------+----------------+---------------------------------+---------+
| id    | int(10) unsigned | NULL            | NO   | PRI | NULL    | auto_increment | select,insert,update,references |         |
| key2  | int(10) unsigned | NULL            | NO   | UNI | NULL    |                | select,insert,update,references |         |
| vc    | varchar(99)      | utf8_general_ci | YES  |     | NULL    |                | select,insert,update,references |         |
+-------+------------------+-----------------+------+-----+---------+----------------+---------------------------------+---------+
3 rows in set (0.00 sec)

mysql> ALTER TABLE iidr CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_520_ci;
Query OK, 2 rows affected (0.14 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> SHOW FULL COLUMNS FROM iidr;
+-------+------------------+------------------------+------+-----+---------+----------------+---------------------------------+---------+
| Field | Type             | Collation              | Null | Key | Default | Extra          | Privileges                      | Comment |
+-------+------------------+------------------------+------+-----+---------+----------------+---------------------------------+---------+
| id    | int(10) unsigned | NULL                   | NO   | PRI | NULL    | auto_increment | select,insert,update,references |         |
| key2  | int(10) unsigned | NULL                   | NO   | UNI | NULL    |                | select,insert,update,references |         |
| vc    | varchar(99)      | utf8mb4_unicode_520_ci | YES  |     | NULL    |                | select,insert,update,references |         |
+-------+------------------+------------------------+------+-----+---------+----------------+---------------------------------+---------+
3 rows in set (0.00 sec)

mysql> SHOW CREATE TABLE iidr\G
*************************** 1. row ***************************
       Table: iidr
Create Table: CREATE TABLE `iidr` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `key2` int(10) unsigned NOT NULL,
  `vc` varchar(99) COLLATE utf8mb4_unicode_520_ci DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `key2` (`key2`)
) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_520_ci
1 row in set (0.00 sec)
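
The session above checks a single table with SHOW FULL COLUMNS. To run the same check across every table at once, something like the following could be used. It is a sketch that assumes the target collation is utf8mb4_unicode_520_ci; substitute utf8_unicode_ci if that is what you converted to.

```sql
-- Sketch: list every string column that is NOT yet in the target collation.
-- The target collation below is an assumption; change it to match yours.
SELECT table_schema,
       table_name,
       column_name,
       character_set_name,
       collation_name
FROM information_schema.columns
WHERE collation_name IS NOT NULL          -- non-string columns have NULL here
  AND collation_name <> 'utf8mb4_unicode_520_ci'
  AND table_schema NOT IN ('mysql', 'information_schema',
                           'performance_schema', 'sys')
ORDER BY table_schema, table_name, ordinal_position;
```

An empty result means nothing was missed. Note that columns of views are listed too, and they follow the collation of the tables underneath them.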

#2


0  

I would suggest creating a new database with the collation you want and running a script to copy all the tables and data into it. Do this on a staging server first, NOT in production. Once you have checked that everything works on staging, repeat it in production.
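
A minimal sketch of that approach for one table is shown here. The database names app_old and app_new are hypothetical, and the table definition is borrowed from the iidr example in the other answer. The table is created without any CHARACTER SET or COLLATE clause so that it inherits the new database's defaults.

```sql
-- Hypothetical names: app_old is the existing database, app_new the copy.
CREATE DATABASE app_new CHARACTER SET utf8 COLLATE utf8_unicode_ci;

-- No charset/collation clause here, so the table and its string columns
-- pick up the database defaults (utf8 / utf8_unicode_ci).
CREATE TABLE app_new.iidr (
  id   INT UNSIGNED NOT NULL AUTO_INCREMENT,
  key2 INT UNSIGNED NOT NULL,
  vc   VARCHAR(99) DEFAULT NULL,
  PRIMARY KEY (id),
  UNIQUE KEY key2 (key2)
) ENGINE=InnoDB;

-- Copy the rows; MySQL converts string values to the target column's
-- character set on insert.
INSERT INTO app_new.iidr (id, key2, vc)
SELECT id, key2, vc
FROM app_old.iidr;
```

CREATE TABLE ... LIKE is deliberately avoided in this sketch, because it copies the old column definitions along with their existing collations.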
