What is the SQL Server equivalent of MySQL's unicode_ci collation?

Date: 2022-05-17 23:27:46

As far as I understand, MySQL's unicode_ci collations (utf8_unicode_ci in particular) are meant to support all characters regardless of locale.

I need to achieve the same with SQL Server 2008 R2. My database will contain data in many different languages (not limited to Latin-based alphabets). I am not going to use non-Unicode strings at all. Which collation should I choose?

1 answer

#1


7  

You might as well go with Latin1_General_CI_AI.

The reason is that SQL Server stores Unicode data in NVARCHAR columns, which use 2 bytes per character whatever the collation is. SQL Server is also more flexible in that it can mix VARCHAR (1 byte per character in single-byte code pages) and NVARCHAR data. So, as far as matching the character coverage of UTF8 goes, any collation will do; the collation only determines how strings are compared and sorted. As for CI: every collation in 2008 (binary ones aside) lets you add the CI specification (in the UI it is the "Case sensitive" checkbox; leave it unchecked for case-insensitive).
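To make the storage point concrete, here is a minimal T-SQL sketch. The table and column names (dbo.Articles, Title) are made up for illustration and are not from the question; the sample strings are just examples from different scripts.

```sql
-- Hypothetical table; the names are illustrative only.
CREATE TABLE dbo.Articles (
    Id    INT IDENTITY(1,1) PRIMARY KEY,
    -- NVARCHAR holds Unicode text; the collation only affects
    -- comparison and sorting, not which characters can be stored.
    Title NVARCHAR(200) COLLATE Latin1_General_CI_AI NOT NULL
);

-- The N prefix marks a Unicode literal. Without it, characters outside
-- the code page of the database's default collation may be lost.
INSERT INTO dbo.Articles (Title)
VALUES (N'Résumé'), (N'Привет'), (N'日本語');

-- LEN counts characters, DATALENGTH counts bytes.
SELECT Title,
       LEN(Title)        AS Chars,
       DATALENGTH(Title) AS Bytes
FROM dbo.Articles;
```

If you run this, the Bytes column should come out larger than Chars for every row, which is the NVARCHAR storage cost mentioned above.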

The last part (AI, accent-insensitive) and some other options, such as width sensitivity, are just additional tuning in SQL Server.
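A quick way to see what these suffixes do is to compare the same literals under different collations. This is only a sketch; the string values are arbitrary examples, and the collation names are standard Windows collations that ship with SQL Server.

```sql
SELECT
    -- CI_AI: case and accents are both ignored, so this is expected to be 1.
    CASE WHEN N'resume' = N'Résumé' COLLATE Latin1_General_CI_AI
         THEN 1 ELSE 0 END AS ci_ai,
    -- CI_AS: accents matter, so this is expected to be 0.
    CASE WHEN N'resume' = N'Résumé' COLLATE Latin1_General_CI_AS
         THEN 1 ELSE 0 END AS ci_as,
    -- CS_AS: case matters, so this is expected to be 0.
    CASE WHEN N'resume' = N'Resume' COLLATE Latin1_General_CS_AS
         THEN 1 ELSE 0 END AS cs_as,
    -- Without WS, full-width and half-width forms compare as equal.
    CASE WHEN N'ABC' = N'ＡＢＣ' COLLATE Latin1_General_CI_AI
         THEN 1 ELSE 0 END AS width_insensitive,
    -- With WS, the same comparison treats them as different.
    CASE WHEN N'ABC' = N'ＡＢＣ' COLLATE Latin1_General_CI_AI_WS
         THEN 1 ELSE 0 END AS width_sensitive;
```

The same COLLATE clause can be put on a column definition or on the database default instead of on each comparison.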

Point #2 from http://forums.mysql.com/read.php?103,187048,188748


utf8_unicode_ci is fine for all these languages: Russian, Bulgarian, Belarusian, Macedonian, Serbian, and Ukrainian.


If you need sorting for a particular language (languages handle accented characters differently), you need a language-specific dictionary order; see http://msdn.microsoft.com/en-us/library/ms144250.aspx. Otherwise, Latin1_General follows the Latin (US English) ordering.
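As a rough illustration of language-specific ordering, the sketch below sorts the same three words under two collations. The words and the choice of Finnish_Swedish_CI_AS are my own example, not something from the question; sys.fn_helpcollations() lists what your instance actually supports.

```sql
-- List the collations available for a language family.
SELECT name, description
FROM sys.fn_helpcollations()
WHERE name LIKE N'Finnish_Swedish%';

-- Latin1_General treats ä as an accented a, so it sorts next to "apple".
SELECT Word
FROM (VALUES (N'zebra'), (N'äpple'), (N'apple')) AS w(Word)
ORDER BY Word COLLATE Latin1_General_CI_AS;

-- Swedish rules treat ä as a separate letter after z, so it sorts last.
SELECT Word
FROM (VALUES (N'zebra'), (N'äpple'), (N'apple')) AS w(Word)
ORDER BY Word COLLATE Finnish_Swedish_CI_AS;
```

Putting COLLATE in the ORDER BY overrides the collation for that query only, so you can keep one general default and apply a language-specific order where a particular audience needs it.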
