Why can't I display a unicode character in the Python interpreter in the Mac OS X Terminal?

Date: 2021-10-04 23:08:12

If I try to paste a unicode character such as the middle dot:

·

in my python interpreter, it does nothing. I'm using Terminal.app on Mac OS X, and when I'm simply in bash I have no trouble:

:~$ ·

But in the interpreter:

:~$ python
Python 2.6.1 (r261:67515, Feb 11 2010, 00:51:29) 
[GCC 4.2.1 (Apple Inc. build 5646)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> 

^^ I get nothing; it just ignores the character I pasted. If I use the escaped \xNN\xNN representation of the middle dot, '\xc2\xb7', and try to convert it to unicode, the interpreter throws an error:

>>> unicode('\xc2\xb7')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 0: ordinal not in range(128)

I have set up 'utf-8' as my default encoding in sitecustomize.py, so:

>>> import sys
>>> sys.getdefaultencoding()
'utf-8'

What gives? It's not the Terminal. It's not Python, what am I doing wrong?!

This question is not related to this question, as that individual is able to paste unicode into his Terminal.

1 answer

#1


6  

unicode('\xc2\xb7') means "decode this byte string with the default codec", which is ascii -- and that of course fails (trying to set a different default encoding has never worked well, and in particular doesn't apply to "pasted literals" -- that would require a different setting anyway). You could instead use u'\xc2\xb7', and see:

>>> print(u'\xc2\xb7')
Â·

since those are two unicode characters of course. While:

>>> print(u'\uc2b7')
슷

gives you a single unicode character (U+C2B7, which is a Hangul syllable). BTW, neither of these is the "middle dot" you were looking for. Maybe you mean

>>> print('\xc2\xb7'.decode('utf8'))
·

which is the middle dot. BTW, for me (python 2.6.4 from python.org on a Mac Terminal.app):

>>> print('슷')
슷

which kind of surprised me (I expected an error...!-).
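
To put the three cases above side by side, here is a minimal sketch (not part of the original session) that inspects each string by code point instead of printing it, so the result does not depend on the terminal's encoding. The helper name describe is made up for illustration, and the script is written to run on Python 2.6+ as well as Python 3.3+:

```python
# -*- coding: utf-8 -*-
from __future__ import print_function
import unicodedata

# The two bytes from the question: the UTF-8 encoding of the middle dot.
raw = b'\xc2\xb7'

# Decoding them as UTF-8 yields a single code point, U+00B7.
middle_dot = raw.decode('utf-8')

# Decoding them as ASCII fails, which is the error shown in the question.
try:
    raw.decode('ascii')
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

# In a unicode literal, \xc2 and \xb7 are two separate code points,
# not a UTF-8 byte pair.
two_chars = u'\xc2\xb7'

# \uc2b7 is one 16-bit escape, so it is a single, unrelated code point.
hangul = u'\uc2b7'


def describe(text):
    """Return a (code point, Unicode name) pair for each character.

    Hypothetical helper, introduced only for this illustration.
    """
    return [('U+%04X' % ord(ch), unicodedata.name(ch)) for ch in text]


print('ascii decode succeeded:', ascii_ok)
for label, text in [('utf-8 decoded', middle_dot),
                    ('\\xc2\\xb7 literal', two_chars),
                    ('\\uc2b7 literal', hangul)]:
    print(label, describe(text))
```

Only the explicitly decoded value comes out as one character named MIDDLE DOT; the u'\xc2\xb7' literal has length 2, and u'\uc2b7' is a different character altogether.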
