Split a string into 2-letter segments [duplicate]

Time: 2022-02-22 10:36:58

I have a string that I need to split into 2-letter pieces. For example, 'ABCDXY' should become ['AB', 'CD', 'XY']. The behavior for an odd number of characters may be entirely arbitrary (I'll check the length in advance).

Is there any way to do this without an ugly loop?

6 solutions

#1


18  

>>> s = 'ABCDXY'
>>> [s[i:i + 2] for i in range(0, len(s), 2)]
['AB', 'CD', 'XY']
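
The same slicing idea generalizes to any chunk size. A minimal sketch, assuming Python 3 and a helper named `chunks` (a hypothetical name that does not appear in the answer):

```python
def chunks(s, n=2):
    """Split s into consecutive pieces of length n (hypothetical helper).

    The final piece is shorter when len(s) is not a multiple of n.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return [s[i:i + n] for i in range(0, len(s), n)]


print(chunks('ABCDXY'))     # ['AB', 'CD', 'XY']
print(chunks('ABCDXYv'))    # ['AB', 'CD', 'XY', 'v']
print(chunks('ABCDXY', 3))  # ['ABC', 'DXY']
```

Slicing past the end of a string does not raise an error in Python, which is why the last, shorter piece needs no special handling.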

#2


16  

Using regular expressions!

>>> import re
>>> s = "ABCDXYv"
>>> re.findall(r'.{1,2}',s,re.DOTALL)
['AB', 'CD', 'XY', 'v']
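
The re.DOTALL flag only matters when the string can contain newlines: without it, `.` does not match '\n', so newline characters silently disappear from the result. A small sketch of the difference (the sample string is made up for illustration):

```python
import re

text = 'AB\nCD'  # made-up sample containing a newline

# Without DOTALL, '.' cannot match the newline, so it is skipped entirely.
print(re.findall(r'.{1,2}', text))             # ['AB', 'CD']

# With DOTALL, every character (including '\n') lands in some piece.
print(re.findall(r'.{1,2}', text, re.DOTALL))  # ['AB', '\nC', 'D']
```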

I know it has been a while, but I came back to this and was curious about which method was better; mine: r'.{1,2}' or Jon's r'..?'. On the surface, Jon's looks much nicer, and I thought it would be much faster than mine, but I was surprised to find otherwise, so I thought I would share:

>>> import timeit
>>> timeit.Timer("re.findall(r'.{1,2}', 'ABCDXYv')", setup='import re').repeat()
[1.9064299485802252, 1.8369554649334674, 1.8548105833383772]
>>> timeit.Timer("re.findall(r'..?', 'ABCDXYv')", setup='import re').repeat()
[1.9142223469651611, 1.8670038395145383, 1.85781945659771]

Which shows that r'.{1,2}' is indeed the faster choice here, but only slightly: the gap is under 2% in every run, which may well be within measurement noise.
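
To repeat the comparison on your own machine, something like the following could be used. It is only a sketch: the helper name `best_time` and the loop counts are arbitrary choices made here, and the absolute numbers will differ between machines and Python versions. Unlike the session above, it precompiles each pattern so that the timing covers matching rather than the re module's pattern-cache lookup, and it takes the minimum of several repeats to reduce noise from other processes.

```python
import re
import timeit


def best_time(pattern, text='ABCDXYv', number=20000, repeat=5):
    """Return the fastest of `repeat` timings of findall (hypothetical helper)."""
    compiled = re.compile(pattern, re.DOTALL)
    timer = timeit.Timer(lambda: compiled.findall(text))
    return min(timer.repeat(repeat=repeat, number=number))


for pattern in (r'.{1,2}', r'..?'):
    print(pattern, best_time(pattern))
```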

#3


2  

You could try:

s = 'ABCDEFG'
r = [s[i:i+2] for i in range(0, len(s), 2)]  # use xrange on Python 2

# r is ['AB', 'CD', 'EF', 'G']

UPDATE 2

If you don't care about a trailing odd character, you could use a regex (avoiding the loop):

import re
s = 'ABCDEFG'
r = re.compile('(..)').findall(s)
# r is ['AB', 'CD', 'EF']

#4


1  

There's nothing ugly about the perfectly Pythonic:

string = 'ABCDXY'
[string[i:i+2] for i in range(0, len(string), 2)]  # use xrange on Python 2

You could also use the following recipe (from http://docs.python.org/library/itertools.html):

from itertools import zip_longest  # named izip_longest on Python 2

def grouper(n, iterable, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    # grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx
    args = [iter(iterable)] * n  # n references to the same iterator
    return zip_longest(*args, fillvalue=fillvalue)

(Which, depending on how you look at it, may or may not be using 'loops' ;))

or something like:

import re
re.findall('..?', string)
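
Note that the grouper recipe yields tuples of characters rather than strings, so the pieces still have to be joined. A sketch of how that might look on Python 3, where the function is called zip_longest; passing an empty string as the fill value is a choice made here so that an odd trailing character comes through unpadded:

```python
from itertools import zip_longest


def grouper(n, iterable, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    args = [iter(iterable)] * n  # n references to one shared iterator
    return zip_longest(*args, fillvalue=fillvalue)


pairs = [''.join(group) for group in grouper(2, 'ABCDXYv', fillvalue='')]
print(pairs)  # ['AB', 'CD', 'XY', 'v']
```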

#5


0  

Yet another solution, this one built on zip and a slice stride:

import itertools  # on Python 2, use itertools.izip_longest and drop list()
list(map(''.join, itertools.zip_longest(mystr[::2], mystr[1::2], fillvalue='')))

It does handle odd-length inputs.
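
To see the odd-length behaviour concretely, here is a Python 3 sketch of the same idea wrapped in a function (the name `pairs` is made up for this example):

```python
from itertools import zip_longest


def pairs(mystr):
    """Pair even-indexed characters with odd-indexed ones (hypothetical helper)."""
    evens = mystr[::2]   # characters at positions 0, 2, 4, ...
    odds = mystr[1::2]   # characters at positions 1, 3, 5, ...
    # zip_longest pads the shorter slice with '' so the last character survives.
    return [''.join(p) for p in zip_longest(evens, odds, fillvalue='')]


print(pairs('ABCDXY'))  # ['AB', 'CD', 'XY']
print(pairs('ABCDE'))   # ['AB', 'CD', 'E']
```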

#6


0  

Here's yet another solution without explicit loops (though @Emmanuel's answer is the most appropriate for your question):

s = 'abcdef'
L = zip(s[::2], s[1::2])
# -> [('a', 'b'), ('c', 'd'), ('e', 'f')]

To get strings:

print map(''.join, L)
# ['ab', 'cd', 'ef']

On Python 3, wrap the zip() and map() calls in list() where necessary, and call print() as a function.
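
For reference, a Python 3 rendering of the same approach might look like the sketch below. Note that plain zip stops at the shorter slice, so a trailing odd character is dropped, which the question explicitly allows.

```python
s = 'abcdef'

# zip() returns a lazy iterator on Python 3, so list() materializes it.
L = list(zip(s[::2], s[1::2]))
print(L)  # [('a', 'b'), ('c', 'd'), ('e', 'f')]

print(list(map(''.join, L)))  # ['ab', 'cd', 'ef']

# With an odd-length input, the last character is silently discarded.
odd = 'abcde'
print(list(map(''.join, zip(odd[::2], odd[1::2]))))  # ['ab', 'cd']
```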
