I18n and passwords that are not US-ASCII, Latin1, or Win1252

Time: 2022-02-24 11:01:24

How do you handle passwords for services when the user enters something that is best represented in Unicode or some other non-Latin character encoding?

Specifically, can you use a Cyrillic password as a password to Oracle? What do you do to verify a user's password against a Windows authentication mechanism if the password is provided as UTF-8?

I have some ideas on how this should be handled in our code, but I'm looking for advice from others to make sure our direction is sound.

2 answers

#1


The encoding itself should not pose a problem for the encryption: most algorithms operate on bytes, not on characters. The one thing that could be a problem is that encrypting the same password under different encodings can yield different values if non-ASCII characters are used in the password. Converting the password to a fixed encoding (such as UTF-8) before encrypting it should solve that problem, though.
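A minimal sketch of that point, assuming Python and using SHA-256 purely as a stand-in for "an algorithm that operates on bytes" (a real system would normally use a dedicated password-hashing function instead). The helper name `password_bytes` is hypothetical, and the NFC normalization step is an extra precaution that the answer does not mention.

```python
import hashlib
import unicodedata


def password_bytes(password: str) -> bytes:
    """Turn a password string into one canonical byte sequence.

    Assumptions (not from the answer): NFC normalization is applied first,
    and UTF-8 is the fixed encoding chosen for the whole system.
    """
    return unicodedata.normalize("NFC", password).encode("utf-8")


password = "пароль"  # Cyrillic for "password"

# Same characters, different encodings: the bytes differ, so the digests differ.
digest_utf8 = hashlib.sha256(password.encode("utf-8")).hexdigest()
digest_cp1251 = hashlib.sha256(password.encode("cp1251")).hexdigest()
print(digest_utf8 == digest_cp1251)

# With one fixed encoding (plus normalization), two visually identical
# spellings of the same text map to the same bytes.
composed = "caf\u00e9"     # "é" as a single code point
decomposed = "cafe\u0301"  # "e" followed by a combining acute accent
print(password_bytes(composed) == password_bytes(decomposed))
```

Whatever encoding is chosen, it has to be applied both when the password is first stored and every time it is verified.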

#2


You might have problems with the authentication mechanism's length restrictions.

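For example, if the system specifies a maximum length of 12 bytes, five Chinese characters in UTF-8 already exceed it, since each one typically takes 3 bytes (15 bytes in total). This is not a problem as such, because the four Chinese characters that do fit should have enough entropy, but you need to be careful about error messages: a limit described in "characters" will mislead users if it is really enforced in bytes.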
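To make the byte arithmetic concrete, here is a small sketch, assuming Python and a hypothetical 12-byte limit as in the example above. The function names `fits_byte_limit` and `length_error` are made up for illustration.

```python
from typing import Optional


def fits_byte_limit(password: str, max_bytes: int = 12) -> bool:
    """Check the limit against the UTF-8 byte length, not the character count."""
    return len(password.encode("utf-8")) <= max_bytes


def length_error(password: str, max_bytes: int = 12) -> Optional[str]:
    """Return an error message that names the real unit (bytes), or None if OK."""
    used = len(password.encode("utf-8"))
    if used <= max_bytes:
        return None
    return (
        f"Password is too long: it needs {used} bytes when encoded, "
        f"but the limit is {max_bytes} bytes."
    )


for candidate in ["password1234", "пароль", "密码安全", "密码安全性"]:
    encoded = candidate.encode("utf-8")
    print(
        f"{candidate!r}: {len(candidate)} characters, "
        f"{len(encoded)} bytes, fits={fits_byte_limit(candidate)}"
    )
```

Twelve ASCII characters, six Cyrillic characters, and four Chinese characters all land on exactly 12 bytes here, while the fifth Chinese character pushes the total to 15.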

Other problems may arise if the authentication mechanism enforces rules like "at least one each of upper case, lower case, punctuation and numeric characters". Several languages have no upper or lower case characters, and Unicode defines dozens of characters that a native speaker would think of as numbers but that a poorly implemented rule may not recognise as such.
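The difference between a naive rule and a Unicode-aware one can be sketched as follows, assuming Python and its standard `unicodedata` module. The function names are hypothetical, and treating the general category `Nd` as "a digit" and `Lu`/`Ll`/`Lt` as "cased letters" is one reasonable reading of such a rule, not the only one.

```python
import re
import unicodedata


def naive_has_digit(password: str) -> bool:
    """A rule that only knows about the ASCII digits 0-9."""
    return re.search(r"[0-9]", password) is not None


def unicode_has_digit(password: str) -> bool:
    """Accept any character whose Unicode general category is Nd (decimal digit)."""
    return any(unicodedata.category(ch) == "Nd" for ch in password)


def has_cased_letters(password: str) -> bool:
    """True if the password contains any letter that has a case at all."""
    return any(unicodedata.category(ch) in ("Lu", "Ll", "Lt") for ch in password)


hindi = "पासवर्ड१२३"  # Devanagari letters followed by the Devanagari digits 1, 2, 3

print(naive_has_digit(hindi))    # the ASCII-only rule sees no digit
print(unicode_has_digit(hindi))  # the Unicode-aware rule does
print(has_cased_letters(hindi))  # Devanagari has no upper or lower case
```

A validator could use something like `has_cased_letters` to skip the upper/lower case requirement when the password contains no cased letters, rather than rejecting it outright.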
