How to do an accent-sensitive search in MySQL

Time: 2023-01-20 19:28:57

I have a MySQL table with the utf8_general_ci collation. In the table, I can see two entries:

abad
abád

I am using a query that looks like this:

SELECT * FROM `words` WHERE `word` = 'abád'

The query result gives both words:

abad
abád

Is there a way to tell MySQL that I only want the accented word? I want the query to return only:

abád

I have also tried this query:

SELECT * FROM `words` WHERE BINARY `word` = 'abád'

It gives me no results. Thank you for the help.

10 solutions

#1


76  

If your searches on that field are always going to be accent-sensitive, declare the collation of the field as utf8_bin (which compares the UTF-8 encoded bytes for equality), or use a language-specific collation that distinguishes between accented and unaccented characters.

col_name varchar(10) collate utf8_bin

If searches are normally accent-insensitive, but you want to make an exception for this search, try:

WHERE col_name = 'abád' collate utf8_bin
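
As a minimal sketch of the per-query option, assuming a hypothetical words table like the one in the question and a connection that uses the utf8 character set (the table definition, the id column, and the SET NAMES call are assumptions for illustration, not taken from the original post):

```sql
-- Assumption: the client sends UTF-8 and the connection charset is utf8.
SET NAMES utf8;

-- Hypothetical table mirroring the question: accent-insensitive by default.
CREATE TABLE words (
  id   INT AUTO_INCREMENT PRIMARY KEY,
  word VARCHAR(10) CHARACTER SET utf8 COLLATE utf8_general_ci
);

INSERT INTO words (word) VALUES ('abad'), ('abád');

-- Uses the column's utf8_general_ci collation, so both rows should match.
SELECT word FROM words WHERE word = 'abád';

-- One-off exception: compare byte by byte for this query only,
-- so only the accented row should match.
SELECT word FROM words WHERE word = 'abád' COLLATE utf8_bin;
```

Keep in mind that utf8_bin also makes the comparison case-sensitive, so 'Abád' would not match 'abád'.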

#2


11  

In my version (MySQL 5.0), there is no utf8 collation available for case-insensitive, accent-sensitive searches. The only accent-sensitive collation for utf8 is utf8_bin. However, it is also case-sensitive.

My workaround has been to use something like this:

SELECT * FROM `words` WHERE LOWER(`word`) = LOWER('aBád') COLLATE utf8_bin
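
A quick way to see what this workaround does is to compare a few literals directly. This is only a sketch and assumes the connection character set is utf8 (the SET NAMES line and the column aliases are assumptions added for illustration):

```sql
SET NAMES utf8;

SELECT
  -- Same letters, different case: lowercasing both sides first
  -- should make the byte-wise comparison succeed (1).
  LOWER('aBád') = LOWER('ABÁD') COLLATE utf8_bin AS differs_only_by_case,
  -- Different accents: the bytes still differ after lowercasing (0).
  LOWER('abad') = LOWER('abád') COLLATE utf8_bin AS differs_only_by_accent;
```

One trade-off: wrapping the column in LOWER() generally prevents MySQL from using an ordinary index on that column for the comparison.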

#3


3  

The MySQL bug, for future reference, is http://bugs.mysql.com/bug.php?id=19567.

#4


0  

SELECT * FROM `words` WHERE `word` = 'abád' COLLATE latin1_general_cs

(or whichever collation ending in _cs matches your column's character set)

#5


0  

You can try comparing the hexadecimal representation of the characters: use HEX() within MySQL and a similar function within your programming language, then match the two values. This worked well for me when I was building a listing where the user could select people by the first letter of their name.
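
In SQL terms, the idea can be sketched as follows, assuming a utf8 connection and the words table from the question (the column aliases are illustrative):

```sql
SET NAMES utf8;

-- Inspect the encoded bytes: in UTF-8, 'á' is the two bytes C3 A1.
SELECT HEX('abad') AS plain_hex,     -- 61626164
       HEX('abád') AS accented_hex;  -- 6162C3A164

-- Match rows whose stored bytes equal those of the search term.
SELECT word FROM words WHERE HEX(word) = HEX('abád');
```

The hex values shown assume a precomposed 'á' (U+00E1); a decomposed form (a followed by a combining accent) would encode to different bytes. On the application side, you would hex-encode the UTF-8 bytes of the search term and compare the result with HEX(word).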

#6


0  

Well, you just described what the utf8_general_ci collation is all about (a, á, à, â, ä, å all compare equal to a).

There were also changes in MySQL Server 5.1 regarding utf8_general_ci and utf8_unicode_ci, so the behavior depends on the server version too. Better check the docs.

So, if it's MySQL Server 5.0, I'd go for utf8_unicode_ci instead of utf8_general_ci, which is obviously wrong for your use case.
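
Before switching collations, it may be worth checking how each one actually compares the two words on your server. A small sketch, assuming a utf8 connection (the aliases are illustrative):

```sql
SET NAMES utf8;

-- 1 means the collation treats the two words as equal, 0 means it does not.
SELECT
  'abad' = 'abád' COLLATE utf8_general_ci AS eq_general_ci,
  'abad' = 'abád' COLLATE utf8_unicode_ci AS eq_unicode_ci,
  'abad' = 'abád' COLLATE utf8_bin        AS eq_bin;
```

As far as I know, utf8_unicode_ci also ignores accents and returns 1 here, just like utf8_general_ci, so only utf8_bin would separate the two words; run the check on your own version to confirm.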

#7


0  

I was getting the same error.

I changed the collation of my table to utf8_bin (through phpMyAdmin) and the problem was solved.
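
For anyone not using phpMyAdmin, the same change can be sketched in plain SQL. The table and column names below are assumed from the question, and the VARCHAR(10) type is a guess; MODIFY needs the column's real, full definition:

```sql
-- Change only the word column (restate its actual type and attributes).
ALTER TABLE words
  MODIFY word VARCHAR(10) CHARACTER SET utf8 COLLATE utf8_bin;

-- Or convert every character column in the table at once.
ALTER TABLE words CONVERT TO CHARACTER SET utf8 COLLATE utf8_bin;
```

Either statement makes comparisons on the affected columns case-sensitive as well as accent-sensitive.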

Hope it helps! :)

#8


0  

Check whether the table's collation ends with "_ci", which stands for case-insensitive.

Change it to the collation with the same or nearest name without the "_ci" suffix.

For example, change "utf8_general_ci" to "utf8_bin".
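
To see which collation a table's columns currently use, something like the following should work (a sketch; the words table name is assumed from the question):

```sql
-- Shows each column together with its collation.
SHOW FULL COLUMNS FROM words;

-- The same information from information_schema.
SELECT COLUMN_NAME, CHARACTER_SET_NAME, COLLATION_NAME
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'words';
```

Note that utf8_bin is not simply utf8_general_ci without the "_ci": it compares raw bytes, so it is both case-sensitive and accent-sensitive.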

#9


0  

The accepted answer is good, but beware that you may have to use COLLATE utf8mb4_bin instead!

WHERE col_name = 'abád' collate utf8mb4_bin

This fixes errors like:

MySQL said: #1253 - COLLATION 'utf8_bin' is not valid for CHARACTER SET 'utf8mb4'
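
The collation has to belong to the character set of the expression it is applied to, which for a string literal is normally the connection character set. A short sketch for checking this (the words table is assumed from the question):

```sql
-- Which character set do string literals on this connection use?
SHOW VARIABLES LIKE 'character_set_connection';

-- Which binary collations exist for utf8mb4 on this server?
SHOW COLLATION LIKE 'utf8mb4%bin';

-- With a utf8mb4 connection, pair the literal with a utf8mb4 collation.
SELECT word FROM words WHERE word = 'abád' COLLATE utf8mb4_bin;
```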

#10


0  

This works for me for an accent-insensitive and case-insensitive search in MySQL Server 5.1, in a utf8_general_ci database, where the column is a LONGBLOB.

select * from words where `column` like '%word%' collate utf8_unicode_ci

With

select * from words where `column` like '%word%' collate utf8_general_ci

the result is case-sensitive but not accent-sensitive.
