Converting a string to binary in Python

Date: 2021-01-09 18:25:44

I need a way to get the binary representation of a string in Python, e.g.:

st = "hello world"
toBinary(st)

Is there a module or some neat way of doing this?

4 answers

#1


66  

Something like this?

>>> st = "hello world"
>>> ' '.join(format(ord(x), 'b') for x in st)
'1101000 1100101 1101100 1101100 1101111 100000 1110111 1101111 1110010 1101100 1100100'

# using `bytearray` (Python 2 only; Python 3 needs an encoding, e.g. bytearray(st, 'utf-8'))
>>> ' '.join(format(x, 'b') for x in bytearray(st))
'1101000 1100101 1101100 1101100 1101111 100000 1110111 1101111 1110010 1101100 1100100'

#2


29  

A more Pythonic way is to first convert your string to a byte array and then apply the bin function with map (Python 2 shown here):

>>> st = "hello world"
>>> map(bin,bytearray(st))
['0b1101000', '0b1100101', '0b1101100', '0b1101100', '0b1101111', '0b100000', '0b1110111', '0b1101111', '0b1110010', '0b1101100', '0b1100100']

Or you can join it:

>>> ' '.join(map(bin,bytearray(st)))
'0b1101000 0b1100101 0b1101100 0b1101100 0b1101111 0b100000 0b1110111 0b1101111 0b1110010 0b1101100 0b1100100'

Note that in Python 3 you need to specify an encoding for the bytearray function:

>>> ' '.join(map(bin,bytearray(st,'utf8')))
'0b1101000 0b1100101 0b1101100 0b1101100 0b1101111 0b100000 0b1110111 0b1101111 0b1110010 0b1101100 0b1100100'

You can also use the binascii module in Python 2:

>>> import binascii
>>> bin(int(binascii.hexlify(st),16))
'0b110100001100101011011000110110001101111001000000111011101101111011100100110110001100100'

hexlify returns the hexadecimal representation of the binary data. You can then convert that to an int by specifying 16 as the base, and convert the int to binary with bin.
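
In Python 3, hexlify expects bytes rather than a str, so the string has to be encoded first. The sketch below assumes UTF-8 and shows the hexlify route next to int.from_bytes, which skips the hex step. The variable names are illustrative only.

```python
import binascii

st = "hello world"
data = st.encode("utf-8")  # assumption: UTF-8; hexlify needs bytes in Python 3

# Route from the answer: bytes -> hex digits -> int -> binary string
via_hexlify = bin(int(binascii.hexlify(data), 16))

# Same integer without going through hex
via_from_bytes = bin(int.from_bytes(data, "big"))

# bin() drops leading zeros; pad to 8 bits per byte if the full width matters
padded = format(int.from_bytes(data, "big"), "0{}b".format(8 * len(data)))

print(via_hexlify)
print(padded)
```

Both routes treat the whole string as one large integer, so the result has no separators and loses any leading zero bits unless it is padded as in the last step.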

#3


11  

You can access the code values for the characters in your string using the ord() built-in function. If you then need to format these in binary, the built-in format() function (or the str.format() method) will do the job.

a = "test"
print(' '.join(format(ord(x), 'b') for x in a))

(Thanks to Ashwini Chaudhary for posting that code snippet.)

While the above code works in Python 3, it prints the code value of each character rather than encoded bytes, and matters get more complicated if you're assuming any encoding other than UTF-8. In Python 2, strings are byte sequences, and ASCII encoding is assumed by default. In Python 3, strings are Unicode, and there's a separate bytes type that acts more like a Python 2 string. If you wish to assume any encoding other than UTF-8, you'll need to specify the encoding.

In Python 3, then, you can do something like this:

a = "test"
a_bytes = bytes(a, "ascii")
print(' '.join(["{0:b}".format(x) for x in a_bytes]))

The differences between UTF-8 and ASCII encoding won't be obvious for simple alphanumeric strings, but will become important if you're processing text that includes characters outside the ASCII character set.
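
To make the encoding difference concrete, here is a small sketch that encodes a single non-ASCII character, é (U+00E9), in three ways. The helper name bits is hypothetical, and Latin-1 is included only as a second example encoding.

```python
def bits(text, encoding):
    """Zero-padded binary for each byte of `text` in the given encoding.

    `bits` is an illustrative helper, not a standard function.
    """
    return " ".join(format(byte, "08b") for byte in text.encode(encoding))


char = "\u00e9"  # the character é

print(bits(char, "utf-8"))    # UTF-8 needs two bytes for this character
print(bits(char, "latin-1"))  # Latin-1 needs one byte

try:
    bits(char, "ascii")
except UnicodeEncodeError as exc:
    # ASCII has no representation for this character at all
    print("ascii cannot encode it:", exc.reason)
```

For a plain string such as "test", all three encodings would produce the same bytes, which is why the difference is easy to miss.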

#4


0  

This is an update for the existing answers that call bytearray() without an encoding, which no longer works in Python 3:

>>> st = "hello world"
>>> map(bin, bytearray(st))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: string argument without an encoding

Because, as the bytearray() documentation explains, if the source is a string, you must also give the encoding:

>>> map(bin, bytearray(st, encoding='utf-8'))
<map object at 0x7f14dfb1ff28>
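
The map object shown above is a lazy iterator in Python 3, so nothing is printed until it is consumed. A minimal sketch of getting the same list and joined string as in the Python 2 answers, assuming UTF-8 (variable names are illustrative):

```python
st = "hello world"

# map() is lazy in Python 3: this creates an iterator, nothing is computed yet
lazy = map(bin, bytearray(st, encoding="utf-8"))

# list() consumes the iterator; reuse the list afterwards, since `lazy` is now empty
as_list = list(lazy)

print(as_list)
print(" ".join(as_list))
```

Passing the map object straight to ' '.join(...) also works, because join consumes any iterable of strings.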
