需要一种快速的方法来计数和求和一个迭代在一个单次。

时间:2022-09-01 21:21:40

Can any one help me? I'm trying to come up with a way to compute

有人能帮我吗?我想想出一个计算的方法

>>> sum_widths = sum(col.width for col in cols if not col.hide)

and also count the number of items in this sum, without having to make two passes over cols.

也要计算这个总数中项目的数量,而不需要经过两次cols。

It seems unbelievable but after scanning the std-lib (built-in functions, itertools, functools, etc), I couldn't even find a function which would count the number of members in an iterable. I found the function itertools.count, which sounds like what I want, but It's really just a deceptively named range function.

这看起来令人难以置信,但是在扫描完std-lib(内置函数、迭代工具、函数工具等)之后,我甚至找不到一个函数来计算一个可迭代的成员数量。我找到了函数itertools。计数,这听起来像我想要的,但它实际上只是一个伪装的范围函数。

After a little thought I came up with the following (which is so simple that the lack of a library function may be excusable, except for its obtuseness):

经过一番思考,我想到了以下几点(这一点非常简单,除了它的迟钝之外,缺乏库函数是可以原谅的):

>>> visable_col_count = sum(col is col for col in cols if not col.hide)

However, using these two functions requires two passes of the iterable, which just rubs me the wrong way.

然而,使用这两个函数需要遍历两次,这只会让我感到不对劲。

As an alternative, the following function does what I want:

作为替代方案,下面的函数实现了我想要的:

>>> def count_and_sum(iter):
>>>     count = sum = 0
>>>     for item in iter:
>>>         count += 1
>>>         sum += item
>>>     return count, sum

The problem with this is that it takes 100 times as long (according to timeit) as the sum of a generator expression form.

这样做的问题是,它花费的时间(根据时间)是生成器表达式表单的总和的100倍。

If anybody can come up with a simple one-liner which does what I want, please let me know (using Python 3.3).

如果有人能想出一个简单的一行程序来做我想做的事情,请让我知道(使用Python 3.3)。

Edit 1

编辑1

Lots of great ideas here, guys. Thanks to all who replied. It will take me a while to digest all these answers, but I will and I will try to pick one to check.

这里有很多很棒的点子。谢谢大家的回复。我需要一段时间来消化所有这些答案,但我会和我将尝试挑选一个来检查。

Edit 2

编辑2

I repeated the timings on my two humble suggestions (count_and_sum function and 2 separate sum functions) and discovered that my original timing was way off, probably due to an auto-scheduled backup process running in the background.

我重复了我的两个建议(count_and_sum函数和两个单独的sum函数)的计时,发现我最初的计时错误,可能是由于后台运行的自动调度备份进程。

I also timed most of the excellent suggestions given as answers here, all with the same model. Analysing these answers has been quite an education for me: new uses for deque, enumerate and reduce and first time for count and accumulate. Thanks to all!

我还把这里提供的大多数优秀建议作为答案进行了计时,所有的建议都采用了相同的模式。分析这些答案对我来说是一种教育:deque的新用途,列举和减少,第一次计数和积累。感谢所有!

Here are the results (from my slow netbook) using the software I'm developing for display:

以下是使用我正在开发的用于显示的软件(来自我的慢上网本)的结果:

┌───────────────────────────────────────────────────────┐
│                 Count and Sum Timing                  │
├──────────────────────────┬───────────┬────────────────┤
│          Method          │Time (usec)│Time (% of base)│
├──────────────────────────┼───────────┼────────────────┤
│count_and_sum (base)      │        7.2│            100%│
│Two sums                  │        7.5│            104%│
│deque enumerate accumulate│        7.3│            101%│
│max enumerate accumulate  │        7.3│            101%│
│reduce                    │        7.4│            103%│
│count sum                 │        7.3│            101%│
└──────────────────────────┴───────────┴────────────────┘

(I didn't time the complex and fold methods as being just too obscure, but thanks anyway.)

(我没觉得复杂的折叠方法太模糊,但还是谢谢你。)

Since there's very little difference in timing among all these methods I decided to use the count_and_sum function (with an explicit for loop) as being the most readable, explicit and simple (Python Zen) and it also happens to be the fastest!

由于所有这些方法之间的时间差别很小,所以我决定使用count_and_sum函数(带有显式的for循环)作为最易读、最显式和最简单的函数(Python Zen),而且它碰巧也是最快的!

I wish I could accept one of these amazing answers as correct but they are all equally good though more or less obscure, so I'm just up-voting everybody and accepting my own answer as correct (count_and_sum function) since that's what I'm using.

我希望我能接受这些令人惊叹的答案中的一个是正确的,但它们都是一样的好,尽管或多或少有些模糊,所以我只是对每个人进行向上投票,并接受我自己的答案为正确的(count_and_sum函数),因为这就是我所使用的。

What was that about "There should be one-- and preferably only one --obvious way to do it."?

“应该有一种——最好只有一种——显而易见的方式,”这是什么意思?

13 个解决方案

#1


100  

Using complex numbers

使用复数

z = [1, 2, 4, 5, 6]
y = sum(x + 1j for x in z)
sum_z, count_z = y.real, int(y.imag)
print sum_z, count_z
18.0 5

#2


49  

I don't know about speed, but this is kind of pretty:

我不知道速度,但这很漂亮

>>> from itertools import accumulate
>>> it = range(10)
>>> max(enumerate(accumulate(it), 1))
(10, 45)

#3


28  

Adaption of DSM's answer. using deque(... maxlen=1) to save memory use.

适应DSM的答案。使用双端队列(…maxlen=1)保存内存使用。

import itertools 
from collections import deque 
deque(enumerate(itertools.accumulate(x), 1), maxlen=1)

timing code in ipython:

在ipython计时代码:

import itertools , random
from collections import deque 

def count_and_sum(iter):
     count = sum = 0
     for item in iter:
         count += 1
         sum += item
     return count, sum

X = [random.randint(0, 10) for _ in range(10**7)]
%timeit count_and_sum(X)
%timeit deque(enumerate(itertools.accumulate(X), 1), maxlen=1)
%timeit (max(enumerate(itertools.accumulate(X), 1)))

results: now faster than OP's method

结果:比OP方法快

1 loops, best of 3: 1.08 s per loop
1 loops, best of 3: 659 ms per loop
1 loops, best of 3: 1.19 s per loop

#4


23  

Here's some timing data that might be of interest:

以下是一些可能值得关注的时间数据:

import timeit

setup = '''
import random, functools, itertools, collections

x = [random.randint(0, 10) for _ in range(10**5)]

def count_and_sum(it):
    c, s = 0, 0
    for i in it:
        c += 1
        s += i
    return c, s

def two_pass(it):
    return sum(i for i in it), sum(True for i in it)

def functional(it):
    return functools.reduce(lambda pair, x: (pair[0]+1, pair[1]+x), it, [0, 0])

def accumulator(it):
    return max(enumerate(itertools.accumulate(it), 1))

def complex(it):
    cpx = sum(x + 1j for x in it)
    return cpx.real, int(cpx.imag)

def dequed(it):
    return collections.deque(enumerate(itertools.accumulate(it), 1), maxlen=1)

'''

number = 100
for stmt in ['count_and_sum(x)',
             'two_pass(x)',
             'functional(x)',
             'accumulator(x)',
             'complex(x)',
             'dequed(x)']:
    print('{:.4}'.format(timeit.timeit(stmt=stmt, setup=setup, number=number)))

Result:

结果:

3.404 # OP's one-pass method
3.833 # OP's two-pass method
8.405 # Timothy Shields's fold method
3.892 # DSM's accumulate-based method
4.946 # 1_CR's complex-number method
2.002 # M4rtini's deque-based modification of DSM's method

Given these results, I'm not really sure how the OP is seeing a 100x slowdown with the one-pass method. Even if the data looks radically different from a list of random integers, that just shouldn't happen.

考虑到这些结果,我不太确定OP是如何通过one-pass方法看到100x的放缓的。即使数据看起来与随机整数列表完全不同,这也不应该发生。

Also, M4rtini's solution looks like the clear winner.

此外,M4rtini的解决方案看起来是明显的赢家。


To clarify, these results are in CPython 3.2.3. For a comparison to PyPy3, see James_pic's answer, which shows some serious gains from JIT compilation for some methods (also mentioned in a comment by M4rtini.

澄清一下,这些结果在CPython 3.2.3中。要与PyPy3进行比较,请参见James_pic的答案,它显示了一些方法的JIT编译的一些重要收获(M4rtini的评论中也提到了这一点)。

#5


19  

As a follow-up to senshin's answer, it's worth noting that the performance differences are largely due to quirks in CPython's implementation, that make some methods slower than others (for example, for loops are relatively slow in CPython). I thought it would be interesting to try the exact same test in PyPy (using PyPy3 2.1 beta), which has different performance characteristics. In PyPy the results are:

作为senshin回答的后续,值得注意的是,性能差异很大程度上是由于CPython实现中的怪癖,这使得一些方法比其他方法慢(例如,在CPython中for循环相对慢)。我认为在PyPy(使用PyPy3 2.1 beta)中尝试相同的测试是很有趣的,它具有不同的性能特征。在PyPy中,结果如下:

0.6227 # OP's one-pass method
0.8714 # OP's two-pass method
1.033 # Timothy Shields's fold method
6.354 # DSM's accumulate-based method
1.287 # 1_CR's complex-number method
3.857 # M4rtini's deque-based modification of DSM's method

In this case, the OP's one-pass method is fastest. This makes sense, as it's arguably the simplest (at least from a compiler's point of view) and PyPy can eliminate many of the overheads by inlining method calls, which CPython can't.

在这种情况下,OP的单程方法是最快的。这是有道理的,因为它可以说是最简单的(至少从编译器的角度来看),PyPy可以通过内联方法调用来消除许多开销,而CPython不能。

For comparison, CPython 3.3.2 on my machine give the following:

为了比较,我的机器上的CPython 3.3.2给出了以下内容:

1.651 # OP's one-pass method
1.825 # OP's two-pass method
3.258 # Timothy Shields's fold method
1.684 # DSM's accumulate-based method
3.072 # 1_CR's complex-number method
1.191 # M4rtini's deque-based modification of DSM's method

#6


5  

You can keep count inside the sum with tricks similar to this

您可以使用类似的技巧在和中保持计数

>>> from itertools import count
>>> cnt = count()
>>> sum((next(cnt), x)[1] for x in range(10) if x%2)
25
>>> next(cnt)
5

But it's probably going to be more readable to just use a for loop

但是使用for循环可能更容易读懂

#7


4  

You can use this:

您可以使用:

from itertools import count

lst = range(10)
c = count(1)
tot = sum(next(c) and x for x in lst if x % 2)
n = next(c)-1
print(n, tot)

# 5 25

It's kind of a hack, but it works perfectly well.

这是一种技巧,但效果很好。

#8


3  

I don't know what the Python syntax is off hand, but you could potentially use a fold. Something like this:

我不知道Python语法是什么,但是您可以使用折叠。是这样的:

(count, total) = fold((0, 0), lambda pair, x: (pair[0] + 1, pair[1] + x))

The idea is to use a seed of (0,0) and then at each step add 1 to the first component and the current number to the second component.

其思想是使用(0,0)的种子,然后在每一步将1添加到第一个组件,当前数字添加到第二个组件。

For comparison, you could implement sum as a fold as follows:

为了便于比较,可以将sum实现为如下折法:

total = fold(0, lambda t, x: t + x)

#9


3  

1_CR's complex number solution is cute but overly hacky. The reason it works is that a complex number is a 2-tuple, that sums elementwise. The same is true of numpy arrays and I think it's slightly cleaner to use those:

1_CR的复数解决方案很可爱,但过于陈腐。它工作的原因是一个复数是一个2元组,这个数字是元素的总和。numpy数组也是如此,我认为使用这些数组比较简洁:

import numpy as np
z = [1, 2, 4, 5, 6]
y = sum(np.array([x, 1]) for x in z)
sum_z, count_z = y[0], y[1]
print sum_z, count_z
18 5

#10


3  

Something else to consider: If it is possible to determine a minimum possible count, we can let the efficient built-in sum do part of the work:

其他需要考虑的事情:如果可以确定最小可能的数量,我们可以让有效的内置和完成部分工作:

from itertools import islice

def count_and_sum(iterable):
    # insert favorite implementation here

def count_and_sum_with_min_count(iterable, min_count):
    iterator = iter(iterable)
    slice_sum = sum(islice(iterator, None, min_count))
    rest_count, rest_sum = count_and_sum(iterator)
    return min_count + rest_count, slice_sum + rest_sum

For example, using the deque method with a sequence of 10000000 items and min_count of 5000000, timing results are:

例如,使用序列为10000000和min_count为5000000的deque方法,计时结果为:

count_and_sum: 1.03
count_and_sum_with_min_count: 0.63

#11


2  

How about this? It seems to work.

这个怎么样?这似乎很奏效。

from functools import reduce

class Column:
    def __init__(self, width, hide):
        self.width = width
        self.hide = hide

lst = [Column(10, False), Column(100, False), Column(1000, True), Column(10000, False)]

print(reduce(lambda acc, col: Column(col.width + acc.width, False) if not col.hide else acc, lst, Column(0, False)).width)

#12


2  

You might only need the sum & count today, but who knows what you'll need tomorrow!

你今天可能只需要计算总数,但谁知道你明天需要什么呢!

Here's an easily extensible solution:

这里有一个易于扩展的解决方案:

def fold_parallel(itr, **fs):
    res = {
        k: zero for k, (zero, f) in fs.items()
    }

    for x in itr:
        for k, (_, f) in fs.items():
            res[k] = f(res[k], x)

    return res

from operator import add

print(fold_parallel([1, 2, 3],
    count = (0, lambda a, b: a + 1),
    sum = (0, add),
))
# {'count': 3, 'sum': 6}

#13


2  

Thanks for all the great answers, but I decided to use my original count_and_sum function, called as follows:

谢谢大家的解答,但是我决定使用我最初的count_and_sum函数,如下所示:

>>> cc, cs = count_and_sum(c.width for c in cols if not c.hide) 

As explained in the edits to my original question this turned out to be the fastest and most readable solution.

正如对我最初的问题的编辑所解释的那样,这是最快也是最易读的解决方案。

#1


100  

Using complex numbers

使用复数

z = [1, 2, 4, 5, 6]
y = sum(x + 1j for x in z)
sum_z, count_z = y.real, int(y.imag)
print sum_z, count_z
18.0 5

#2


49  

I don't know about speed, but this is kind of pretty:

我不知道速度,但这很漂亮

>>> from itertools import accumulate
>>> it = range(10)
>>> max(enumerate(accumulate(it), 1))
(10, 45)

#3


28  

Adaption of DSM's answer. using deque(... maxlen=1) to save memory use.

适应DSM的答案。使用双端队列(…maxlen=1)保存内存使用。

import itertools 
from collections import deque 
deque(enumerate(itertools.accumulate(x), 1), maxlen=1)

timing code in ipython:

在ipython计时代码:

import itertools , random
from collections import deque 

def count_and_sum(iter):
     count = sum = 0
     for item in iter:
         count += 1
         sum += item
     return count, sum

X = [random.randint(0, 10) for _ in range(10**7)]
%timeit count_and_sum(X)
%timeit deque(enumerate(itertools.accumulate(X), 1), maxlen=1)
%timeit (max(enumerate(itertools.accumulate(X), 1)))

results: now faster than OP's method

结果:比OP方法快

1 loops, best of 3: 1.08 s per loop
1 loops, best of 3: 659 ms per loop
1 loops, best of 3: 1.19 s per loop

#4


23  

Here's some timing data that might be of interest:

以下是一些可能值得关注的时间数据:

import timeit

setup = '''
import random, functools, itertools, collections

x = [random.randint(0, 10) for _ in range(10**5)]

def count_and_sum(it):
    c, s = 0, 0
    for i in it:
        c += 1
        s += i
    return c, s

def two_pass(it):
    return sum(i for i in it), sum(True for i in it)

def functional(it):
    return functools.reduce(lambda pair, x: (pair[0]+1, pair[1]+x), it, [0, 0])

def accumulator(it):
    return max(enumerate(itertools.accumulate(it), 1))

def complex(it):
    cpx = sum(x + 1j for x in it)
    return cpx.real, int(cpx.imag)

def dequed(it):
    return collections.deque(enumerate(itertools.accumulate(it), 1), maxlen=1)

'''

number = 100
for stmt in ['count_and_sum(x)',
             'two_pass(x)',
             'functional(x)',
             'accumulator(x)',
             'complex(x)',
             'dequed(x)']:
    print('{:.4}'.format(timeit.timeit(stmt=stmt, setup=setup, number=number)))

Result:

结果:

3.404 # OP's one-pass method
3.833 # OP's two-pass method
8.405 # Timothy Shields's fold method
3.892 # DSM's accumulate-based method
4.946 # 1_CR's complex-number method
2.002 # M4rtini's deque-based modification of DSM's method

Given these results, I'm not really sure how the OP is seeing a 100x slowdown with the one-pass method. Even if the data looks radically different from a list of random integers, that just shouldn't happen.

考虑到这些结果,我不太确定OP是如何通过one-pass方法看到100x的放缓的。即使数据看起来与随机整数列表完全不同,这也不应该发生。

Also, M4rtini's solution looks like the clear winner.

此外,M4rtini的解决方案看起来是明显的赢家。


To clarify, these results are in CPython 3.2.3. For a comparison to PyPy3, see James_pic's answer, which shows some serious gains from JIT compilation for some methods (also mentioned in a comment by M4rtini.

澄清一下,这些结果在CPython 3.2.3中。要与PyPy3进行比较,请参见James_pic的答案,它显示了一些方法的JIT编译的一些重要收获(M4rtini的评论中也提到了这一点)。

#5


19  

As a follow-up to senshin's answer, it's worth noting that the performance differences are largely due to quirks in CPython's implementation, that make some methods slower than others (for example, for loops are relatively slow in CPython). I thought it would be interesting to try the exact same test in PyPy (using PyPy3 2.1 beta), which has different performance characteristics. In PyPy the results are:

作为senshin回答的后续,值得注意的是,性能差异很大程度上是由于CPython实现中的怪癖,这使得一些方法比其他方法慢(例如,在CPython中for循环相对慢)。我认为在PyPy(使用PyPy3 2.1 beta)中尝试相同的测试是很有趣的,它具有不同的性能特征。在PyPy中,结果如下:

0.6227 # OP's one-pass method
0.8714 # OP's two-pass method
1.033 # Timothy Shields's fold method
6.354 # DSM's accumulate-based method
1.287 # 1_CR's complex-number method
3.857 # M4rtini's deque-based modification of DSM's method

In this case, the OP's one-pass method is fastest. This makes sense, as it's arguably the simplest (at least from a compiler's point of view) and PyPy can eliminate many of the overheads by inlining method calls, which CPython can't.

在这种情况下,OP的单程方法是最快的。这是有道理的,因为它可以说是最简单的(至少从编译器的角度来看),PyPy可以通过内联方法调用来消除许多开销,而CPython不能。

For comparison, CPython 3.3.2 on my machine give the following:

为了比较,我的机器上的CPython 3.3.2给出了以下内容:

1.651 # OP's one-pass method
1.825 # OP's two-pass method
3.258 # Timothy Shields's fold method
1.684 # DSM's accumulate-based method
3.072 # 1_CR's complex-number method
1.191 # M4rtini's deque-based modification of DSM's method

#6


5  

You can keep count inside the sum with tricks similar to this

您可以使用类似的技巧在和中保持计数

>>> from itertools import count
>>> cnt = count()
>>> sum((next(cnt), x)[1] for x in range(10) if x%2)
25
>>> next(cnt)
5

But it's probably going to be more readable to just use a for loop

但是使用for循环可能更容易读懂

#7


4  

You can use this:

您可以使用:

from itertools import count

lst = range(10)
c = count(1)
tot = sum(next(c) and x for x in lst if x % 2)
n = next(c)-1
print(n, tot)

# 5 25

It's kind of a hack, but it works perfectly well.

这是一种技巧,但效果很好。

#8


3  

I don't know what the Python syntax is off hand, but you could potentially use a fold. Something like this:

我不知道Python语法是什么,但是您可以使用折叠。是这样的:

(count, total) = fold((0, 0), lambda pair, x: (pair[0] + 1, pair[1] + x))

The idea is to use a seed of (0,0) and then at each step add 1 to the first component and the current number to the second component.

其思想是使用(0,0)的种子,然后在每一步将1添加到第一个组件,当前数字添加到第二个组件。

For comparison, you could implement sum as a fold as follows:

为了便于比较,可以将sum实现为如下折法:

total = fold(0, lambda t, x: t + x)

#9


3  

1_CR's complex number solution is cute but overly hacky. The reason it works is that a complex number is a 2-tuple, that sums elementwise. The same is true of numpy arrays and I think it's slightly cleaner to use those:

1_CR的复数解决方案很可爱,但过于陈腐。它工作的原因是一个复数是一个2元组,这个数字是元素的总和。numpy数组也是如此,我认为使用这些数组比较简洁:

import numpy as np
z = [1, 2, 4, 5, 6]
y = sum(np.array([x, 1]) for x in z)
sum_z, count_z = y[0], y[1]
print sum_z, count_z
18 5

#10


3  

Something else to consider: If it is possible to determine a minimum possible count, we can let the efficient built-in sum do part of the work:

其他需要考虑的事情:如果可以确定最小可能的数量,我们可以让有效的内置和完成部分工作:

from itertools import islice

def count_and_sum(iterable):
    # insert favorite implementation here

def count_and_sum_with_min_count(iterable, min_count):
    iterator = iter(iterable)
    slice_sum = sum(islice(iterator, None, min_count))
    rest_count, rest_sum = count_and_sum(iterator)
    return min_count + rest_count, slice_sum + rest_sum

For example, using the deque method with a sequence of 10000000 items and min_count of 5000000, timing results are:

例如,使用序列为10000000和min_count为5000000的deque方法,计时结果为:

count_and_sum: 1.03
count_and_sum_with_min_count: 0.63

#11


2  

How about this? It seems to work.

这个怎么样?这似乎很奏效。

from functools import reduce

class Column:
    def __init__(self, width, hide):
        self.width = width
        self.hide = hide

lst = [Column(10, False), Column(100, False), Column(1000, True), Column(10000, False)]

print(reduce(lambda acc, col: Column(col.width + acc.width, False) if not col.hide else acc, lst, Column(0, False)).width)

#12


2  

You might only need the sum & count today, but who knows what you'll need tomorrow!

你今天可能只需要计算总数,但谁知道你明天需要什么呢!

Here's an easily extensible solution:

这里有一个易于扩展的解决方案:

def fold_parallel(itr, **fs):
    res = {
        k: zero for k, (zero, f) in fs.items()
    }

    for x in itr:
        for k, (_, f) in fs.items():
            res[k] = f(res[k], x)

    return res

from operator import add

print(fold_parallel([1, 2, 3],
    count = (0, lambda a, b: a + 1),
    sum = (0, add),
))
# {'count': 3, 'sum': 6}

#13


2  

Thanks for all the great answers, but I decided to use my original count_and_sum function, called as follows:

谢谢大家的解答,但是我决定使用我最初的count_and_sum函数,如下所示:

>>> cc, cs = count_and_sum(c.width for c in cols if not c.hide) 

As explained in the edits to my original question this turned out to be the fastest and most readable solution.

正如对我最初的问题的编辑所解释的那样,这是最快也是最易读的解决方案。