如何将一个列表分割成平均大小的块?

时间:2021-07-14 03:19:50

I have a list of arbitrary length, and I need to split it up into equal size chunks and operate on it. There are some obvious ways to do this, like keeping a counter and two lists, and when the second list fills up, add it to the first list and empty the second list for the next round of data, but this is potentially extremely expensive.

我有一个任意长度的列表,我需要把它分成相等大小的块并在上面操作。有一些很明显的方法可以做到这一点,比如保留一个计数器和两个列表,当第二个列表填满时,将它添加到第一个列表中,并清空下一轮数据的第二个列表,但是这可能非常昂贵。

I was wondering if anyone had a good solution to this for lists of any length, e.g. using generators.

我想知道是否有人有一个好的解决方案来列出任何长度的列表,例如使用生成器。

I was looking for something useful in itertools but I couldn't find anything obviously useful. Might've missed it, though.

我在寻找一些有用的迭代工具,但我找不到任何明显有用的东西。不过,可能已经错过了它。

Related question: What is the most “pythonic” way to iterate over a list in chunks?

相关的问题:在一个列表中迭代的最“python”方式是什么?

57 个解决方案

#1


1964  

Here's a generator that yields the chunks you want:

这是一个产生你想要的块的生成器:

def chunks(l, n):
    """Yield successive n-sized chunks from l."""
    for i in range(0, len(l), n):
        yield l[i:i + n]

import pprint
pprint.pprint(list(chunks(range(10, 75), 10)))
[[10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
 [20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
 [30, 31, 32, 33, 34, 35, 36, 37, 38, 39],
 [40, 41, 42, 43, 44, 45, 46, 47, 48, 49],
 [50, 51, 52, 53, 54, 55, 56, 57, 58, 59],
 [60, 61, 62, 63, 64, 65, 66, 67, 68, 69],
 [70, 71, 72, 73, 74]]

If you're using Python 2, you should use xrange() instead of range():

如果您使用的是python2,您应该使用xrange()而不是range():

def chunks(l, n):
    """Yield successive n-sized chunks from l."""
    for i in xrange(0, len(l), n):
        yield l[i:i + n]

Also you can simply use list comprehension instead of writing a function. Python 3:

你也可以简单地使用列表理解而不是写一个函数。Python 3:

[l[i:i + n] for i in range(0, len(l), n)]

Python 2 version:

Python 2版本:

[l[i:i + n] for i in xrange(0, len(l), n)]

#2


473  

If you want something super simple:

如果你想要超简单的东西:

def chunks(l, n):
    n = max(1, n)
    return (l[i:i+n] for i in xrange(0, len(l), n))

#3


245  

Directly from the (old) Python documentation (recipes for itertools):

直接从(旧的)Python文档(itertools的菜谱):

from itertools import izip, chain, repeat

def grouper(n, iterable, padvalue=None):
    "grouper(3, 'abcdefg', 'x') --> ('a','b','c'), ('d','e','f'), ('g','x','x')"
    return izip(*[chain(iterable, repeat(padvalue, n-1))]*n)

The current version, as suggested by J.F.Sebastian:

正如约翰·f·塞巴斯蒂安建议的那样:

#from itertools import izip_longest as zip_longest # for Python 2.x
from itertools import zip_longest # for Python 3.x
#from six.moves import zip_longest # for both (uses the six compat library)

def grouper(n, iterable, padvalue=None):
    "grouper(3, 'abcdefg', 'x') --> ('a','b','c'), ('d','e','f'), ('g','x','x')"
    return zip_longest(*[iter(iterable)]*n, fillvalue=padvalue)

I guess Guido's time machine works—worked—will work—will have worked—was working again.

我猜圭多的时间机器工作——将会工作——会重新开始工作。

These solutions work because [iter(iterable)]*n (or the equivalent in the earlier version) creates one iterator, repeated n times in the list. izip_longest then effectively performs a round-robin of "each" iterator; because this is the same iterator, it is advanced by each such call, resulting in each such zip-roundrobin generating one tuple of n items.

这些解决方案之所以有效,是因为[iter(iterable)]*n(或早期版本中的等效项)创建一个迭代器,在列表中重复n次。izip_long然后有效地执行“每个”迭代器的循环;因为这是相同的迭代器,因此每次调用都是高级的,从而导致每个这样的zip-roundrobin生成一个n个项目的tuple。

#4


85  

I know this is kind of old but I don't why nobody mentioned numpy.array_split:

我知道这有点老,但我不知道为什么没人提到numpy.array_split:

lst = range(50)
In [26]: np.array_split(lst,5)
Out[26]: 
[array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]),
 array([10, 11, 12, 13, 14, 15, 16, 17, 18, 19]),
 array([20, 21, 22, 23, 24, 25, 26, 27, 28, 29]),
 array([30, 31, 32, 33, 34, 35, 36, 37, 38, 39]),
 array([40, 41, 42, 43, 44, 45, 46, 47, 48, 49])]

#5


72  

Here is a generator that work on arbitrary iterables:

这是一个在任意的iterables上工作的生成器:

def split_seq(iterable, size):
    it = iter(iterable)
    item = list(itertools.islice(it, size))
    while item:
        yield item
        item = list(itertools.islice(it, size))

Example:

例子:

>>> import pprint
>>> pprint.pprint(list(split_seq(xrange(75), 10)))
[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
 [10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
 [20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
 [30, 31, 32, 33, 34, 35, 36, 37, 38, 39],
 [40, 41, 42, 43, 44, 45, 46, 47, 48, 49],
 [50, 51, 52, 53, 54, 55, 56, 57, 58, 59],
 [60, 61, 62, 63, 64, 65, 66, 67, 68, 69],
 [70, 71, 72, 73, 74]]

#6


56  

I'm surprised nobody has thought of using iter's two-argument form:

我很惊讶没有人想到使用iter的两个参数形式:

from itertools import islice

def chunk(it, size):
    it = iter(it)
    return iter(lambda: tuple(islice(it, size)), ())

Demo:

演示:

>>> list(chunk(range(14), 3))
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13)]

This works with any iterable and produces output lazily. It returns tuples rather than iterators, but I think it has a certain elegance nonetheless. It also doesn't pad; if you want padding, a simple variation on the above will suffice:

这适用于任何可迭代的,并可延迟地产生输出。它返回元组而不是迭代器,但我认为它仍然具有一定的优雅性。它也不垫;如果你想要填充,上面的简单变化就足够了:

from itertools import islice, chain, repeat

def chunk_pad(it, size, padval=None):
    it = chain(iter(it), repeat(padval))
    return iter(lambda: tuple(islice(it, size)), (padval,) * size)

Demo:

演示:

>>> list(chunk_pad(range(14), 3))
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13, None)]
>>> list(chunk_pad(range(14), 3, 'a'))
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13, 'a')]

Like the izip_longest-based solutions, the above always pads. As far as I know, there's no one- or two-line itertools recipe for a function that optionally pads. By combining the above two approaches, this one comes pretty close:

就像izip_最长的解决方案一样,上面总是有垫。据我所知,没有一种或两行迭代工具可以选择一个可选的功能。结合以上两种方法,这一方法非常接近:

_no_padding = object()

def chunk(it, size, padval=_no_padding):
    if padval == _no_padding:
        it = iter(it)
        sentinel = ()
    else:
        it = chain(iter(it), repeat(padval))
        sentinel = (padval,) * size
    return iter(lambda: tuple(islice(it, size)), sentinel)

Demo:

演示:

>>> list(chunk(range(14), 3))
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13)]
>>> list(chunk(range(14), 3, None))
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13, None)]
>>> list(chunk(range(14), 3, 'a'))
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13, 'a')]

I believe this is the shortest chunker proposed that offers optional padding.

我相信这是一个最短的chunker提议,提供可选的填充。

#7


47  

def chunk(input, size):
    return map(None, *([iter(input)] * size))

#8


36  

Simple yet elegant

简单而优雅

l = range(1, 1000)
print [l[x:x+10] for x in xrange(0, len(l), 10)]

or if you prefer:

或者如果你喜欢:

chunks = lambda l, n: [l[x: x+n] for x in xrange(0, len(l), n)]
chunks(l, 10)

#9


27  

I saw the most awesome Python-ish answer in a duplicate of this question:

我在这个问题的副本中看到了最令人敬畏的python式答案:

from itertools import zip_longest

a = range(1, 16)
i = iter(a)
r = list(zip_longest(i, i, i))
>>> print(r)
[(1, 2, 3), (4, 5, 6), (7, 8, 9), (10, 11, 12), (13, 14, 15)]

You can create n-tuple for any n. If a = range(1, 15), then the result will be:

可以为任意n创建n元组,如果a = range(1,15),则结果为:

[(1, 2, 3), (4, 5, 6), (7, 8, 9), (10, 11, 12), (13, 14, None)]

If the list is divided evenly, then you can replace zip_longest with zip, otherwise the triplet (13, 14, None) would be lost. Python 3 is used above. For Python 2, use izip_longest.

如果列表是平均分配的,那么您可以用zip替换zip_length,否则将丢失triplet (13, 14, None)。Python 3在上面使用。对于Python 2,使用izip_long。

#10


23  

Critique of other answers here:

None of these answers are evenly sized chunks, they all leave a runt chunk at the end, so they're not completely balanced. If you were using these functions to distribute work, you've built-in the prospect of one likely finishing well before the others, so it would sit around doing nothing while the others continued working hard.

这些答案中没有一个是平均大小的块,它们在最后都会留下一个小块,所以它们不是完全平衡的。如果你使用这些功能来分配工作,你就已经内置了一个很可能在其他人之前完成的任务,所以当其他人继续努力工作的时候,它会无所事事地无所事事。

For example, the current top answer ends with:

例如,当前的最高答案以:

[60, 61, 62, 63, 64, 65, 66, 67, 68, 69],
[70, 71, 72, 73, 74]]

I just hate that runt at the end!

我只是讨厌最后的那个笨蛋!

Others, like list(grouper(3, xrange(7))), and chunk(xrange(7), 3) both return: [(0, 1, 2), (3, 4, 5), (6, None, None)]. The None's are just padding, and rather inelegant in my opinion. They are NOT evenly chunking the iterables.

其他的,如list(grouper(3, xrange(7)))和chunk(xrange(7), 3)都返回:[(0,1,2),(3,4,5),(6,None, None)]。在我看来,没有人是虚情假意的。它们不均匀地组块。

Why can't we divide these better?

为什么我们不能更好地划分这些?

My Solution(s)

Here's a balanced solution, adapted from a function I've used in production (Note in Python 3 to replace xrange with range):

这里有一个平衡的解决方案,它是根据我在生产中使用的一个函数(在Python 3中为替换xrange而编写的):

def baskets_from(items, maxbaskets=25):
    baskets = [[] for _ in xrange(maxbaskets)] # in Python 3 use range
    for i, item in enumerate(items):
        baskets[i % maxbaskets].append(item)
    return filter(None, baskets) 

And I created a generator that does the same if you put it into a list:

我创建了一个生成器,如果你把它放到一个列表中,它也会这样做:

def iter_baskets_from(items, maxbaskets=3):
    '''generates evenly balanced baskets from indexable iterable'''
    item_count = len(items)
    baskets = min(item_count, maxbaskets)
    for x_i in xrange(baskets):
        yield [items[y_i] for y_i in xrange(x_i, item_count, baskets)]

And finally, since I see that all of the above functions return elements in a contiguous order (as they were given):

最后,由于我看到上述所有函数都以连续的顺序返回元素(如下所示):

def iter_baskets_contiguous(items, maxbaskets=3, item_count=None):
    '''
    generates balanced baskets from iterable, contiguous contents
    provide item_count if providing a iterator that doesn't support len()
    '''
    item_count = item_count or len(items)
    baskets = min(item_count, maxbaskets)
    items = iter(items)
    floor = item_count // baskets 
    ceiling = floor + 1
    stepdown = item_count % baskets
    for x_i in xrange(baskets):
        length = ceiling if x_i < stepdown else floor
        yield [items.next() for _ in xrange(length)]

Output

To test them out:

进行测试:

print(baskets_from(xrange(6), 8))
print(list(iter_baskets_from(xrange(6), 8)))
print(list(iter_baskets_contiguous(xrange(6), 8)))
print(baskets_from(xrange(22), 8))
print(list(iter_baskets_from(xrange(22), 8)))
print(list(iter_baskets_contiguous(xrange(22), 8)))
print(baskets_from('ABCDEFG', 3))
print(list(iter_baskets_from('ABCDEFG', 3)))
print(list(iter_baskets_contiguous('ABCDEFG', 3)))
print(baskets_from(xrange(26), 5))
print(list(iter_baskets_from(xrange(26), 5)))
print(list(iter_baskets_contiguous(xrange(26), 5)))

Which prints out:

打印出:

[[0], [1], [2], [3], [4], [5]]
[[0], [1], [2], [3], [4], [5]]
[[0], [1], [2], [3], [4], [5]]
[[0, 8, 16], [1, 9, 17], [2, 10, 18], [3, 11, 19], [4, 12, 20], [5, 13, 21], [6, 14], [7, 15]]
[[0, 8, 16], [1, 9, 17], [2, 10, 18], [3, 11, 19], [4, 12, 20], [5, 13, 21], [6, 14], [7, 15]]
[[0, 1, 2], [3, 4, 5], [6, 7, 8], [9, 10, 11], [12, 13, 14], [15, 16, 17], [18, 19], [20, 21]]
[['A', 'D', 'G'], ['B', 'E'], ['C', 'F']]
[['A', 'D', 'G'], ['B', 'E'], ['C', 'F']]
[['A', 'B', 'C'], ['D', 'E'], ['F', 'G']]
[[0, 5, 10, 15, 20, 25], [1, 6, 11, 16, 21], [2, 7, 12, 17, 22], [3, 8, 13, 18, 23], [4, 9, 14, 19, 24]]
[[0, 5, 10, 15, 20, 25], [1, 6, 11, 16, 21], [2, 7, 12, 17, 22], [3, 8, 13, 18, 23], [4, 9, 14, 19, 24]]
[[0, 1, 2, 3, 4, 5], [6, 7, 8, 9, 10], [11, 12, 13, 14, 15], [16, 17, 18, 19, 20], [21, 22, 23, 24, 25]]

Notice that the contiguous generator provide chunks in the same length patterns as the other two, but the items are all in order, and they are as evenly divided as one may divide a list of discrete elements.

注意,相邻的生成器提供与另外两个相同的长度模式的块,但是这些项都是有序的,并且它们被平均分配,因为它们可以划分一个离散元素的列表。

#11


22  

more-itertools has a chunks iterator.

更多的迭代工具有一个块迭代器。

It also has a lot more things, including all the recipes in the itertools documentation.

它还有很多东西,包括itertools文档中的所有菜谱。

#12


15  

If you had a chunk size of 3 for example, you could do:

如果你的数据块大小为3,你可以这样做:

zip(*[iterable[i::3] for i in range(3)]) 

source: http://code.activestate.com/recipes/303060-group-a-list-into-sequential-n-tuples/

来源:http://code.activestate.com/recipes/303060-group-a-list-into-sequential-n-tuples/

I would use this when my chunk size is fixed number I can type, e.g. '3', and would never change.

当我的块大小是固定的数字时,我会用这个。“3”,永远不变。

#13


15  

If you know list size:

如果你知道列表的大小:

def SplitList(list, chunk_size):
    return [list[offs:offs+chunk_size] for offs in range(0, len(list), chunk_size)]

If you don't (an iterator):

如果您不(迭代器):

def IterChunks(sequence, chunk_size):
    res = []
    for item in sequence:
        res.append(item)
        if len(res) >= chunk_size:
            yield res
            res = []
    if res:
        yield res  # yield the last, incomplete, portion

In the latter case, it can be rephrased in a more beautiful way if you can be sure that the sequence always contains a whole number of chunks of given size (i.e. there is no incomplete last chunk).

在后一种情况下,如果您可以确信序列总是包含给定大小的整数块(也就是说没有不完整的最后一块),那么可以用更漂亮的方式来重新措辞。

#14


14  

A generator expression:

一个生成器表达式:

def chunks(seq, n):
    return (seq[i:i+n] for i in xrange(0, len(seq), n))

eg.

如。

print list(chunks(range(1, 1000), 10))

#15


13  

I like the Python doc's version proposed by tzot and J.F.Sebastian a lot, but it has two shortcomings:

我喜欢tzot和J.F.提出的Python文档版本。Sebastian有很多,但它有两个缺点:

  • it is not very explicit
  • 它不是很明确。
  • I usually don't want a fill value in the last chunk
  • 我通常不希望在最后一个块中有填充值。

I'm using this one a lot in my code:

我在我的代码中使用了很多

from itertools import islice

def chunks(n, iterable):
    iterable = iter(iterable)
    while True:
        yield tuple(islice(iterable, n)) or iterable.next()

UPDATE: A lazy chunks version:

更新:一个懒惰的块版本:

from itertools import chain, islice

def chunks(n, iterable):
   iterable = iter(iterable)
   while True:
       yield chain([next(iterable)], islice(iterable, n-1))

#16


10  

At this point, I think we need a recursive generator, just in case...

在这一点上,我认为我们需要一个递归生成器,以防万一……

In python 2:

在python中2:

def chunks(li, n):
    if li == []:
        return
    yield li[:n]
    for e in chunks(li[n:], n):
        yield e

In python 3:

在python 3:

def chunks(li, n):
    if li == []:
        return
    yield li[:n]
    yield from chunks(li[n:], n)

Also, in case of massive Alien invasion, a decorated recursive generator might become handy:

此外,在大规模外星人入侵的情况下,一个装饰过的递归发电机可能会派上用场:

def dec(gen):
    def new_gen(li, n):
        for e in gen(li, n):
            if e == []:
                return
            yield e
    return new_gen

@dec
def chunks(li, n):
    yield li[:n]
    for e in chunks(li[n:], n):
        yield e

#17


9  

The toolz library has the partition function for this:

toolz库有这个分区函数:

from toolz.itertoolz.core import partition

list(partition(2, [1, 2, 3, 4]))
[(1, 2), (3, 4)]

#18


8  

You may also use get_chunks function of utilspie library as:

您还可以使用utilspie库的get_chunks函数为:

>>> from utilspie import iterutils
>>> a = [1, 2, 3, 4, 5, 6, 7, 8, 9]

>>> list(iterutils.get_chunks(a, 5))
[[1, 2, 3, 4, 5], [6, 7, 8, 9]]

You can install utilspie via pip:

您可以通过pip安装utilspie:

sudo pip install utilspie

Disclaimer: I am the creator of utilspie library.

免责声明:我是utilspie库的创建者。

#19


7  

[AA[i:i+SS] for i in range(len(AA))[::SS]]

Where AA is array, SS is chunk size. For example:

其中AA是数组,SS是块大小。例如:

>>> AA=range(10,21);SS=3
>>> [AA[i:i+SS] for i in range(len(AA))[::SS]]
[[10, 11, 12], [13, 14, 15], [16, 17, 18], [19, 20]]
# or [range(10, 13), range(13, 16), range(16, 19), range(19, 21)] in py3

#20


6  

def split_seq(seq, num_pieces):
    start = 0
    for i in xrange(num_pieces):
        stop = start + len(seq[i::num_pieces])
        yield seq[start:stop]
        start = stop

usage:

用法:

seq = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

for seq in split_seq(seq, 3):
    print seq

#21


6  

code:

代码:

def split_list(the_list, chunk_size):
    result_list = []
    while the_list:
        result_list.append(the_list[:chunk_size])
        the_list = the_list[chunk_size:]
    return result_list

a_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

print split_list(a_list, 3)

result:

结果:

[[1, 2, 3], [4, 5, 6], [7, 8, 9], [10]]

#22


6  

I was curious about the performance of different approaches and here it is:

我对不同方法的表现很好奇,这里是:

Tested on Python 3.5.1

Python测试3.5.1

import time
batch_size = 7
arr_len = 298937

#---------slice-------------

print("\r\nslice")
start = time.time()
arr = [i for i in range(0, arr_len)]
while True:
    if not arr:
        break

    tmp = arr[0:batch_size]
    arr = arr[batch_size:-1]
print(time.time() - start)

#-----------index-----------

print("\r\nindex")
arr = [i for i in range(0, arr_len)]
start = time.time()
for i in range(0, round(len(arr) / batch_size + 1)):
    tmp = arr[batch_size * i : batch_size * (i + 1)]
print(time.time() - start)

#----------batches 1------------

def batch(iterable, n=1):
    l = len(iterable)
    for ndx in range(0, l, n):
        yield iterable[ndx:min(ndx + n, l)]

print("\r\nbatches 1")
arr = [i for i in range(0, arr_len)]
start = time.time()
for x in batch(arr, batch_size):
    tmp = x
print(time.time() - start)

#----------batches 2------------

from itertools import islice, chain

def batch(iterable, size):
    sourceiter = iter(iterable)
    while True:
        batchiter = islice(sourceiter, size)
        yield chain([next(batchiter)], batchiter)


print("\r\nbatches 2")
arr = [i for i in range(0, arr_len)]
start = time.time()
for x in batch(arr, batch_size):
    tmp = x
print(time.time() - start)

#---------chunks-------------
def chunks(l, n):
    """Yield successive n-sized chunks from l."""
    for i in range(0, len(l), n):
        yield l[i:i + n]
print("\r\nchunks")
arr = [i for i in range(0, arr_len)]
start = time.time()
for x in chunks(arr, batch_size):
    tmp = x
print(time.time() - start)

#-----------grouper-----------

from itertools import zip_longest # for Python 3.x
#from six.moves import zip_longest # for both (uses the six compat library)

def grouper(iterable, n, padvalue=None):
    "grouper(3, 'abcdefg', 'x') --> ('a','b','c'), ('d','e','f'), ('g','x','x')"
    return zip_longest(*[iter(iterable)]*n, fillvalue=padvalue)

arr = [i for i in range(0, arr_len)]
print("\r\ngrouper")
start = time.time()
for x in grouper(arr, batch_size):
    tmp = x
print(time.time() - start)

Results:

结果:

slice
31.18285083770752

index
0.02184295654296875

batches 1
0.03503894805908203

batches 2
0.22681021690368652

chunks
0.019841909408569336

grouper
0.006506919860839844

#23


5  

heh, one line version

嘿,一行版本

In [48]: chunk = lambda ulist, step:  map(lambda i: ulist[i:i+step],  xrange(0, len(ulist), step))

In [49]: chunk(range(1,100), 10)
Out[49]: 
[[1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
 [11, 12, 13, 14, 15, 16, 17, 18, 19, 20],
 [21, 22, 23, 24, 25, 26, 27, 28, 29, 30],
 [31, 32, 33, 34, 35, 36, 37, 38, 39, 40],
 [41, 42, 43, 44, 45, 46, 47, 48, 49, 50],
 [51, 52, 53, 54, 55, 56, 57, 58, 59, 60],
 [61, 62, 63, 64, 65, 66, 67, 68, 69, 70],
 [71, 72, 73, 74, 75, 76, 77, 78, 79, 80],
 [81, 82, 83, 84, 85, 86, 87, 88, 89, 90],
 [91, 92, 93, 94, 95, 96, 97, 98, 99]]

#24


5  

Another more explicit version.

另一个更明确的版本。

def chunkList(initialList, chunkSize):
    """
    This function chunks a list into sub lists 
    that have a length equals to chunkSize.

    Example:
    lst = [3, 4, 9, 7, 1, 1, 2, 3]
    print(chunkList(lst, 3)) 
    returns
    [[3, 4, 9], [7, 1, 1], [2, 3]]
    """
    finalList = []
    for i in range(0, len(initialList), chunkSize):
        finalList.append(initialList[i:i+chunkSize])
    return finalList

#25


4  

Consider using matplotlib.cbook pieces

考虑使用matplotlib。图书贝贝碎片

for example:

例如:

import matplotlib.cbook as cbook
segments = cbook.pieces(np.arange(20), 3)
for s in segments:
     print s

#26


4  

a = [1, 2, 3, 4, 5, 6, 7, 8, 9]
CHUNK = 4
[a[i*CHUNK:(i+1)*CHUNK] for i in xrange((len(a) + CHUNK - 1) / CHUNK )]

#27


4  

Without calling len() which is good for large lists:

不调用len()对大列表有好处:

def splitter(l, n):
    i = 0
    chunk = l[:n]
    while chunk:
        yield chunk
        i += n
        chunk = l[i:i+n]

And this is for iterables:

这是用于迭代的:

def isplitter(l, n):
    l = iter(l)
    chunk = list(islice(l, n))
    while chunk:
        yield chunk
        chunk = list(islice(l, n))

The functional flavour of the above:

上面的功能味道:

def isplitter2(l, n):
    return takewhile(bool,
                     (tuple(islice(start, n))
                            for start in repeat(iter(l))))

OR:

或者:

def chunks_gen_sentinel(n, seq):
    continuous_slices = imap(islice, repeat(iter(seq)), repeat(0), repeat(n))
    return iter(imap(tuple, continuous_slices).next,())

OR:

或者:

def chunks_gen_filter(n, seq):
    continuous_slices = imap(islice, repeat(iter(seq)), repeat(0), repeat(n))
    return takewhile(bool,imap(tuple, continuous_slices))

#28


3  

def chunks(iterable,n):
    """assumes n is an integer>0
    """
    iterable=iter(iterable)
    while True:
        result=[]
        for i in range(n):
            try:
                a=next(iterable)
            except StopIteration:
                break
            else:
                result.append(a)
        if result:
            yield result
        else:
            break

g1=(i*i for i in range(10))
g2=chunks(g1,3)
print g2
'<generator object chunks at 0x0337B9B8>'
print list(g2)
'[[0, 1, 4], [9, 16, 25], [36, 49, 64], [81]]'

#29


3  

I realise this question is old (stumbled over it on Google), but surely something like the following is far simpler and clearer than any of the huge complex suggestions and only uses slicing:

我意识到这个问题已经过时了(在谷歌上被发现了),但是像下面这样的事情比任何一个复杂的建议都要简单得多,而且只使用切片:

def chunker(iterable, chunksize):
    for i,c in enumerate(iterable[::chunksize]):
        yield iterable[i*chunksize:(i+1)*chunksize]

>>> for chunk in chunker(range(0,100), 10):
...     print list(chunk)
... 
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
[10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
[20, 21, 22, 23, 24, 25, 26, 27, 28, 29]
... etc ...

#30


3  

See this reference

看到这个参考

>>> orange = range(1, 1001)
>>> otuples = list( zip(*[iter(orange)]*10))
>>> print(otuples)
[(1, 2, 3, 4, 5, 6, 7, 8, 9, 10), ... (991, 992, 993, 994, 995, 996, 997, 998, 999, 1000)]
>>> olist = [list(i) for i in otuples]
>>> print(olist)
[[1, 2, 3, 4, 5, 6, 7, 8, 9, 10], ..., [991, 992, 993, 994, 995, 996, 997, 998, 999, 1000]]
>>> 

Python3

Python3

#1


1964  

Here's a generator that yields the chunks you want:

这是一个产生你想要的块的生成器:

def chunks(l, n):
    """Yield successive n-sized chunks from l."""
    for i in range(0, len(l), n):
        yield l[i:i + n]

import pprint
pprint.pprint(list(chunks(range(10, 75), 10)))
[[10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
 [20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
 [30, 31, 32, 33, 34, 35, 36, 37, 38, 39],
 [40, 41, 42, 43, 44, 45, 46, 47, 48, 49],
 [50, 51, 52, 53, 54, 55, 56, 57, 58, 59],
 [60, 61, 62, 63, 64, 65, 66, 67, 68, 69],
 [70, 71, 72, 73, 74]]

If you're using Python 2, you should use xrange() instead of range():

如果您使用的是python2,您应该使用xrange()而不是range():

def chunks(l, n):
    """Yield successive n-sized chunks from l."""
    for i in xrange(0, len(l), n):
        yield l[i:i + n]

Also you can simply use list comprehension instead of writing a function. Python 3:

你也可以简单地使用列表理解而不是写一个函数。Python 3:

[l[i:i + n] for i in range(0, len(l), n)]

Python 2 version:

Python 2版本:

[l[i:i + n] for i in xrange(0, len(l), n)]

#2


473  

If you want something super simple:

如果你想要超简单的东西:

def chunks(l, n):
    n = max(1, n)
    return (l[i:i+n] for i in xrange(0, len(l), n))

#3


245  

Directly from the (old) Python documentation (recipes for itertools):

直接从(旧的)Python文档(itertools的菜谱):

from itertools import izip, chain, repeat

def grouper(n, iterable, padvalue=None):
    "grouper(3, 'abcdefg', 'x') --> ('a','b','c'), ('d','e','f'), ('g','x','x')"
    return izip(*[chain(iterable, repeat(padvalue, n-1))]*n)

The current version, as suggested by J.F.Sebastian:

正如约翰·f·塞巴斯蒂安建议的那样:

#from itertools import izip_longest as zip_longest # for Python 2.x
from itertools import zip_longest # for Python 3.x
#from six.moves import zip_longest # for both (uses the six compat library)

def grouper(n, iterable, padvalue=None):
    "grouper(3, 'abcdefg', 'x') --> ('a','b','c'), ('d','e','f'), ('g','x','x')"
    return zip_longest(*[iter(iterable)]*n, fillvalue=padvalue)

I guess Guido's time machine works—worked—will work—will have worked—was working again.

我猜圭多的时间机器工作——将会工作——会重新开始工作。

These solutions work because [iter(iterable)]*n (or the equivalent in the earlier version) creates one iterator, repeated n times in the list. izip_longest then effectively performs a round-robin of "each" iterator; because this is the same iterator, it is advanced by each such call, resulting in each such zip-roundrobin generating one tuple of n items.

这些解决方案之所以有效,是因为[iter(iterable)]*n(或早期版本中的等效项)创建一个迭代器,在列表中重复n次。izip_long然后有效地执行“每个”迭代器的循环;因为这是相同的迭代器,因此每次调用都是高级的,从而导致每个这样的zip-roundrobin生成一个n个项目的tuple。

#4


85  

I know this is kind of old but I don't why nobody mentioned numpy.array_split:

我知道这有点老,但我不知道为什么没人提到numpy.array_split:

lst = range(50)
In [26]: np.array_split(lst,5)
Out[26]: 
[array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]),
 array([10, 11, 12, 13, 14, 15, 16, 17, 18, 19]),
 array([20, 21, 22, 23, 24, 25, 26, 27, 28, 29]),
 array([30, 31, 32, 33, 34, 35, 36, 37, 38, 39]),
 array([40, 41, 42, 43, 44, 45, 46, 47, 48, 49])]

#5


72  

Here is a generator that work on arbitrary iterables:

这是一个在任意的iterables上工作的生成器:

def split_seq(iterable, size):
    it = iter(iterable)
    item = list(itertools.islice(it, size))
    while item:
        yield item
        item = list(itertools.islice(it, size))

Example:

例子:

>>> import pprint
>>> pprint.pprint(list(split_seq(xrange(75), 10)))
[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
 [10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
 [20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
 [30, 31, 32, 33, 34, 35, 36, 37, 38, 39],
 [40, 41, 42, 43, 44, 45, 46, 47, 48, 49],
 [50, 51, 52, 53, 54, 55, 56, 57, 58, 59],
 [60, 61, 62, 63, 64, 65, 66, 67, 68, 69],
 [70, 71, 72, 73, 74]]

#6


56  

I'm surprised nobody has thought of using iter's two-argument form:

我很惊讶没有人想到使用iter的两个参数形式:

from itertools import islice

def chunk(it, size):
    it = iter(it)
    return iter(lambda: tuple(islice(it, size)), ())

Demo:

演示:

>>> list(chunk(range(14), 3))
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13)]

This works with any iterable and produces output lazily. It returns tuples rather than iterators, but I think it has a certain elegance nonetheless. It also doesn't pad; if you want padding, a simple variation on the above will suffice:

这适用于任何可迭代的,并可延迟地产生输出。它返回元组而不是迭代器,但我认为它仍然具有一定的优雅性。它也不垫;如果你想要填充,上面的简单变化就足够了:

from itertools import islice, chain, repeat

def chunk_pad(it, size, padval=None):
    it = chain(iter(it), repeat(padval))
    return iter(lambda: tuple(islice(it, size)), (padval,) * size)

Demo:

演示:

>>> list(chunk_pad(range(14), 3))
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13, None)]
>>> list(chunk_pad(range(14), 3, 'a'))
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13, 'a')]

Like the izip_longest-based solutions, the above always pads. As far as I know, there's no one- or two-line itertools recipe for a function that optionally pads. By combining the above two approaches, this one comes pretty close:

就像izip_最长的解决方案一样,上面总是有垫。据我所知,没有一种或两行迭代工具可以选择一个可选的功能。结合以上两种方法,这一方法非常接近:

_no_padding = object()

def chunk(it, size, padval=_no_padding):
    if padval == _no_padding:
        it = iter(it)
        sentinel = ()
    else:
        it = chain(iter(it), repeat(padval))
        sentinel = (padval,) * size
    return iter(lambda: tuple(islice(it, size)), sentinel)

Demo:

演示:

>>> list(chunk(range(14), 3))
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13)]
>>> list(chunk(range(14), 3, None))
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13, None)]
>>> list(chunk(range(14), 3, 'a'))
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13, 'a')]

I believe this is the shortest chunker proposed that offers optional padding.

我相信这是一个最短的chunker提议,提供可选的填充。

#7


47  

def chunk(input, size):
    return map(None, *([iter(input)] * size))

#8


36  

Simple yet elegant

简单而优雅

l = range(1, 1000)
print [l[x:x+10] for x in xrange(0, len(l), 10)]

or if you prefer:

或者如果你喜欢:

chunks = lambda l, n: [l[x: x+n] for x in xrange(0, len(l), n)]
chunks(l, 10)

#9


27  

I saw the most awesome Python-ish answer in a duplicate of this question:

我在这个问题的副本中看到了最令人敬畏的python式答案:

from itertools import zip_longest

a = range(1, 16)
i = iter(a)
r = list(zip_longest(i, i, i))
>>> print(r)
[(1, 2, 3), (4, 5, 6), (7, 8, 9), (10, 11, 12), (13, 14, 15)]

You can create n-tuple for any n. If a = range(1, 15), then the result will be:

可以为任意n创建n元组,如果a = range(1,15),则结果为:

[(1, 2, 3), (4, 5, 6), (7, 8, 9), (10, 11, 12), (13, 14, None)]

If the list is divided evenly, then you can replace zip_longest with zip, otherwise the triplet (13, 14, None) would be lost. Python 3 is used above. For Python 2, use izip_longest.

如果列表是平均分配的,那么您可以用zip替换zip_length,否则将丢失triplet (13, 14, None)。Python 3在上面使用。对于Python 2,使用izip_long。

#10


23  

Critique of other answers here:

None of these answers are evenly sized chunks, they all leave a runt chunk at the end, so they're not completely balanced. If you were using these functions to distribute work, you've built-in the prospect of one likely finishing well before the others, so it would sit around doing nothing while the others continued working hard.

这些答案中没有一个是平均大小的块,它们在最后都会留下一个小块,所以它们不是完全平衡的。如果你使用这些功能来分配工作,你就已经内置了一个很可能在其他人之前完成的任务,所以当其他人继续努力工作的时候,它会无所事事地无所事事。

For example, the current top answer ends with:

例如,当前的最高答案以:

[60, 61, 62, 63, 64, 65, 66, 67, 68, 69],
[70, 71, 72, 73, 74]]

I just hate that runt at the end!

我只是讨厌最后的那个笨蛋!

Others, like list(grouper(3, xrange(7))), and chunk(xrange(7), 3) both return: [(0, 1, 2), (3, 4, 5), (6, None, None)]. The None's are just padding, and rather inelegant in my opinion. They are NOT evenly chunking the iterables.

其他的,如list(grouper(3, xrange(7)))和chunk(xrange(7), 3)都返回:[(0,1,2),(3,4,5),(6,None, None)]。在我看来,没有人是虚情假意的。它们不均匀地组块。

Why can't we divide these better?

为什么我们不能更好地划分这些?

My Solution(s)

Here's a balanced solution, adapted from a function I've used in production (Note in Python 3 to replace xrange with range):

这里有一个平衡的解决方案,它是根据我在生产中使用的一个函数(在Python 3中为替换xrange而编写的):

def baskets_from(items, maxbaskets=25):
    baskets = [[] for _ in xrange(maxbaskets)] # in Python 3 use range
    for i, item in enumerate(items):
        baskets[i % maxbaskets].append(item)
    return filter(None, baskets) 

And I created a generator that does the same if you put it into a list:

我创建了一个生成器,如果你把它放到一个列表中,它也会这样做:

def iter_baskets_from(items, maxbaskets=3):
    '''generates evenly balanced baskets from indexable iterable'''
    item_count = len(items)
    baskets = min(item_count, maxbaskets)
    for x_i in xrange(baskets):
        yield [items[y_i] for y_i in xrange(x_i, item_count, baskets)]

And finally, since I see that all of the above functions return elements in a contiguous order (as they were given):

最后,由于我看到上述所有函数都以连续的顺序返回元素(如下所示):

def iter_baskets_contiguous(items, maxbaskets=3, item_count=None):
    '''
    generates balanced baskets from iterable, contiguous contents
    provide item_count if providing a iterator that doesn't support len()
    '''
    item_count = item_count or len(items)
    baskets = min(item_count, maxbaskets)
    items = iter(items)
    floor = item_count // baskets 
    ceiling = floor + 1
    stepdown = item_count % baskets
    for x_i in xrange(baskets):
        length = ceiling if x_i < stepdown else floor
        yield [items.next() for _ in xrange(length)]

Output

To test them out:

进行测试:

print(baskets_from(xrange(6), 8))
print(list(iter_baskets_from(xrange(6), 8)))
print(list(iter_baskets_contiguous(xrange(6), 8)))
print(baskets_from(xrange(22), 8))
print(list(iter_baskets_from(xrange(22), 8)))
print(list(iter_baskets_contiguous(xrange(22), 8)))
print(baskets_from('ABCDEFG', 3))
print(list(iter_baskets_from('ABCDEFG', 3)))
print(list(iter_baskets_contiguous('ABCDEFG', 3)))
print(baskets_from(xrange(26), 5))
print(list(iter_baskets_from(xrange(26), 5)))
print(list(iter_baskets_contiguous(xrange(26), 5)))

Which prints out:

打印出:

[[0], [1], [2], [3], [4], [5]]
[[0], [1], [2], [3], [4], [5]]
[[0], [1], [2], [3], [4], [5]]
[[0, 8, 16], [1, 9, 17], [2, 10, 18], [3, 11, 19], [4, 12, 20], [5, 13, 21], [6, 14], [7, 15]]
[[0, 8, 16], [1, 9, 17], [2, 10, 18], [3, 11, 19], [4, 12, 20], [5, 13, 21], [6, 14], [7, 15]]
[[0, 1, 2], [3, 4, 5], [6, 7, 8], [9, 10, 11], [12, 13, 14], [15, 16, 17], [18, 19], [20, 21]]
[['A', 'D', 'G'], ['B', 'E'], ['C', 'F']]
[['A', 'D', 'G'], ['B', 'E'], ['C', 'F']]
[['A', 'B', 'C'], ['D', 'E'], ['F', 'G']]
[[0, 5, 10, 15, 20, 25], [1, 6, 11, 16, 21], [2, 7, 12, 17, 22], [3, 8, 13, 18, 23], [4, 9, 14, 19, 24]]
[[0, 5, 10, 15, 20, 25], [1, 6, 11, 16, 21], [2, 7, 12, 17, 22], [3, 8, 13, 18, 23], [4, 9, 14, 19, 24]]
[[0, 1, 2, 3, 4, 5], [6, 7, 8, 9, 10], [11, 12, 13, 14, 15], [16, 17, 18, 19, 20], [21, 22, 23, 24, 25]]

Notice that the contiguous generator provide chunks in the same length patterns as the other two, but the items are all in order, and they are as evenly divided as one may divide a list of discrete elements.

注意,相邻的生成器提供与另外两个相同的长度模式的块,但是这些项都是有序的,并且它们被平均分配,因为它们可以划分一个离散元素的列表。

#11


22  

more-itertools has a chunks iterator.

更多的迭代工具有一个块迭代器。

It also has a lot more things, including all the recipes in the itertools documentation.

它还有很多东西,包括itertools文档中的所有菜谱。

#12


15  

If you had a chunk size of 3 for example, you could do:

如果你的数据块大小为3,你可以这样做:

zip(*[iterable[i::3] for i in range(3)]) 

source: http://code.activestate.com/recipes/303060-group-a-list-into-sequential-n-tuples/

来源:http://code.activestate.com/recipes/303060-group-a-list-into-sequential-n-tuples/

I would use this when my chunk size is fixed number I can type, e.g. '3', and would never change.

当我的块大小是固定的数字时,我会用这个。“3”,永远不变。

#13


15  

If you know list size:

如果你知道列表的大小:

def SplitList(list, chunk_size):
    return [list[offs:offs+chunk_size] for offs in range(0, len(list), chunk_size)]

If you don't (an iterator):

如果您不(迭代器):

def IterChunks(sequence, chunk_size):
    res = []
    for item in sequence:
        res.append(item)
        if len(res) >= chunk_size:
            yield res
            res = []
    if res:
        yield res  # yield the last, incomplete, portion

In the latter case, it can be rephrased in a more beautiful way if you can be sure that the sequence always contains a whole number of chunks of given size (i.e. there is no incomplete last chunk).

在后一种情况下,如果您可以确信序列总是包含给定大小的整数块(也就是说没有不完整的最后一块),那么可以用更漂亮的方式来重新措辞。

#14


14  

A generator expression:

一个生成器表达式:

def chunks(seq, n):
    return (seq[i:i+n] for i in xrange(0, len(seq), n))

eg.

如。

print list(chunks(range(1, 1000), 10))

#15


13  

I like the Python doc's version proposed by tzot and J.F.Sebastian a lot, but it has two shortcomings:

我喜欢tzot和J.F.提出的Python文档版本。Sebastian有很多,但它有两个缺点:

  • it is not very explicit
  • 它不是很明确。
  • I usually don't want a fill value in the last chunk
  • 我通常不希望在最后一个块中有填充值。

I'm using this one a lot in my code:

我在我的代码中使用了很多

from itertools import islice

def chunks(n, iterable):
    iterable = iter(iterable)
    while True:
        yield tuple(islice(iterable, n)) or iterable.next()

UPDATE: A lazy chunks version:

更新:一个懒惰的块版本:

from itertools import chain, islice

def chunks(n, iterable):
   iterable = iter(iterable)
   while True:
       yield chain([next(iterable)], islice(iterable, n-1))

#16


10  

At this point, I think we need a recursive generator, just in case...

在这一点上,我认为我们需要一个递归生成器,以防万一……

In python 2:

在python中2:

def chunks(li, n):
    if li == []:
        return
    yield li[:n]
    for e in chunks(li[n:], n):
        yield e

In python 3:

在python 3:

def chunks(li, n):
    if li == []:
        return
    yield li[:n]
    yield from chunks(li[n:], n)

Also, in case of massive Alien invasion, a decorated recursive generator might become handy:

此外,在大规模外星人入侵的情况下,一个装饰过的递归发电机可能会派上用场:

def dec(gen):
    def new_gen(li, n):
        for e in gen(li, n):
            if e == []:
                return
            yield e
    return new_gen

@dec
def chunks(li, n):
    yield li[:n]
    for e in chunks(li[n:], n):
        yield e

#17


9  

The toolz library has the partition function for this:

toolz库有这个分区函数:

from toolz.itertoolz.core import partition

list(partition(2, [1, 2, 3, 4]))
[(1, 2), (3, 4)]

#18


8  

You may also use get_chunks function of utilspie library as:

您还可以使用utilspie库的get_chunks函数为:

>>> from utilspie import iterutils
>>> a = [1, 2, 3, 4, 5, 6, 7, 8, 9]

>>> list(iterutils.get_chunks(a, 5))
[[1, 2, 3, 4, 5], [6, 7, 8, 9]]

You can install utilspie via pip:

您可以通过pip安装utilspie:

sudo pip install utilspie

Disclaimer: I am the creator of utilspie library.

免责声明:我是utilspie库的创建者。

#19


7  

[AA[i:i+SS] for i in range(len(AA))[::SS]]

Where AA is array, SS is chunk size. For example:

其中AA是数组,SS是块大小。例如:

>>> AA=range(10,21);SS=3
>>> [AA[i:i+SS] for i in range(len(AA))[::SS]]
[[10, 11, 12], [13, 14, 15], [16, 17, 18], [19, 20]]
# or [range(10, 13), range(13, 16), range(16, 19), range(19, 21)] in py3

#20


6  

def split_seq(seq, num_pieces):
    start = 0
    for i in xrange(num_pieces):
        stop = start + len(seq[i::num_pieces])
        yield seq[start:stop]
        start = stop

usage:

用法:

seq = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

for seq in split_seq(seq, 3):
    print seq

#21


6  

code:

代码:

def split_list(the_list, chunk_size):
    result_list = []
    while the_list:
        result_list.append(the_list[:chunk_size])
        the_list = the_list[chunk_size:]
    return result_list

a_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

print split_list(a_list, 3)

result:

结果:

[[1, 2, 3], [4, 5, 6], [7, 8, 9], [10]]

#22


6  

I was curious about the performance of different approaches and here it is:

我对不同方法的表现很好奇,这里是:

Tested on Python 3.5.1

Python测试3.5.1

import time
batch_size = 7
arr_len = 298937

#---------slice-------------

print("\r\nslice")
start = time.time()
arr = [i for i in range(0, arr_len)]
while True:
    if not arr:
        break

    tmp = arr[0:batch_size]
    arr = arr[batch_size:-1]
print(time.time() - start)

#-----------index-----------

print("\r\nindex")
arr = [i for i in range(0, arr_len)]
start = time.time()
for i in range(0, round(len(arr) / batch_size + 1)):
    tmp = arr[batch_size * i : batch_size * (i + 1)]
print(time.time() - start)

#----------batches 1------------

def batch(iterable, n=1):
    l = len(iterable)
    for ndx in range(0, l, n):
        yield iterable[ndx:min(ndx + n, l)]

print("\r\nbatches 1")
arr = [i for i in range(0, arr_len)]
start = time.time()
for x in batch(arr, batch_size):
    tmp = x
print(time.time() - start)

#----------batches 2------------

from itertools import islice, chain

def batch(iterable, size):
    sourceiter = iter(iterable)
    while True:
        batchiter = islice(sourceiter, size)
        yield chain([next(batchiter)], batchiter)


print("\r\nbatches 2")
arr = [i for i in range(0, arr_len)]
start = time.time()
for x in batch(arr, batch_size):
    tmp = x
print(time.time() - start)

#---------chunks-------------
def chunks(l, n):
    """Yield successive n-sized chunks from l."""
    for i in range(0, len(l), n):
        yield l[i:i + n]
print("\r\nchunks")
arr = [i for i in range(0, arr_len)]
start = time.time()
for x in chunks(arr, batch_size):
    tmp = x
print(time.time() - start)

#-----------grouper-----------

from itertools import zip_longest # for Python 3.x
#from six.moves import zip_longest # for both (uses the six compat library)

def grouper(iterable, n, padvalue=None):
    "grouper(3, 'abcdefg', 'x') --> ('a','b','c'), ('d','e','f'), ('g','x','x')"
    return zip_longest(*[iter(iterable)]*n, fillvalue=padvalue)

arr = [i for i in range(0, arr_len)]
print("\r\ngrouper")
start = time.time()
for x in grouper(arr, batch_size):
    tmp = x
print(time.time() - start)

Results:

结果:

slice
31.18285083770752

index
0.02184295654296875

batches 1
0.03503894805908203

batches 2
0.22681021690368652

chunks
0.019841909408569336

grouper
0.006506919860839844

#23


5  

heh, one line version

嘿,一行版本

In [48]: chunk = lambda ulist, step:  map(lambda i: ulist[i:i+step],  xrange(0, len(ulist), step))

In [49]: chunk(range(1,100), 10)
Out[49]: 
[[1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
 [11, 12, 13, 14, 15, 16, 17, 18, 19, 20],
 [21, 22, 23, 24, 25, 26, 27, 28, 29, 30],
 [31, 32, 33, 34, 35, 36, 37, 38, 39, 40],
 [41, 42, 43, 44, 45, 46, 47, 48, 49, 50],
 [51, 52, 53, 54, 55, 56, 57, 58, 59, 60],
 [61, 62, 63, 64, 65, 66, 67, 68, 69, 70],
 [71, 72, 73, 74, 75, 76, 77, 78, 79, 80],
 [81, 82, 83, 84, 85, 86, 87, 88, 89, 90],
 [91, 92, 93, 94, 95, 96, 97, 98, 99]]

#24


5  

Another more explicit version.

另一个更明确的版本。

def chunkList(initialList, chunkSize):
    """
    This function chunks a list into sub lists 
    that have a length equals to chunkSize.

    Example:
    lst = [3, 4, 9, 7, 1, 1, 2, 3]
    print(chunkList(lst, 3)) 
    returns
    [[3, 4, 9], [7, 1, 1], [2, 3]]
    """
    finalList = []
    for i in range(0, len(initialList), chunkSize):
        finalList.append(initialList[i:i+chunkSize])
    return finalList

#25


4  

Consider using matplotlib.cbook pieces

考虑使用matplotlib。图书贝贝碎片

for example:

例如:

import matplotlib.cbook as cbook
segments = cbook.pieces(np.arange(20), 3)
for s in segments:
     print s

#26


4  

a = [1, 2, 3, 4, 5, 6, 7, 8, 9]
CHUNK = 4
[a[i*CHUNK:(i+1)*CHUNK] for i in xrange((len(a) + CHUNK - 1) / CHUNK )]

#27


4  

Without calling len() which is good for large lists:

不调用len()对大列表有好处:

def splitter(l, n):
    i = 0
    chunk = l[:n]
    while chunk:
        yield chunk
        i += n
        chunk = l[i:i+n]

And this is for iterables:

这是用于迭代的:

def isplitter(l, n):
    l = iter(l)
    chunk = list(islice(l, n))
    while chunk:
        yield chunk
        chunk = list(islice(l, n))

The functional flavour of the above:

上面的功能味道:

def isplitter2(l, n):
    return takewhile(bool,
                     (tuple(islice(start, n))
                            for start in repeat(iter(l))))

OR:

或者:

def chunks_gen_sentinel(n, seq):
    continuous_slices = imap(islice, repeat(iter(seq)), repeat(0), repeat(n))
    return iter(imap(tuple, continuous_slices).next,())

OR:

或者:

def chunks_gen_filter(n, seq):
    continuous_slices = imap(islice, repeat(iter(seq)), repeat(0), repeat(n))
    return takewhile(bool,imap(tuple, continuous_slices))

#28


3  

def chunks(iterable,n):
    """assumes n is an integer>0
    """
    iterable=iter(iterable)
    while True:
        result=[]
        for i in range(n):
            try:
                a=next(iterable)
            except StopIteration:
                break
            else:
                result.append(a)
        if result:
            yield result
        else:
            break

g1=(i*i for i in range(10))
g2=chunks(g1,3)
print g2
'<generator object chunks at 0x0337B9B8>'
print list(g2)
'[[0, 1, 4], [9, 16, 25], [36, 49, 64], [81]]'

#29


3  

I realise this question is old (stumbled over it on Google), but surely something like the following is far simpler and clearer than any of the huge complex suggestions and only uses slicing:

我意识到这个问题已经过时了(在谷歌上被发现了),但是像下面这样的事情比任何一个复杂的建议都要简单得多,而且只使用切片:

def chunker(iterable, chunksize):
    for i,c in enumerate(iterable[::chunksize]):
        yield iterable[i*chunksize:(i+1)*chunksize]

>>> for chunk in chunker(range(0,100), 10):
...     print list(chunk)
... 
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
[10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
[20, 21, 22, 23, 24, 25, 26, 27, 28, 29]
... etc ...

#30


3  

See this reference

看到这个参考

>>> orange = range(1, 1001)
>>> otuples = list( zip(*[iter(orange)]*10))
>>> print(otuples)
[(1, 2, 3, 4, 5, 6, 7, 8, 9, 10), ... (991, 992, 993, 994, 995, 996, 997, 998, 999, 1000)]
>>> olist = [list(i) for i in otuples]
>>> print(olist)
[[1, 2, 3, 4, 5, 6, 7, 8, 9, 10], ..., [991, 992, 993, 994, 995, 996, 997, 998, 999, 1000]]
>>> 

Python3

Python3