Conditional counting: performance difference between using SUM() and COUNT()?

Date: 2021-03-29 20:09:44

Just as a very simple example, let's say I have table test with sample data like so:

a     |     b      
-------------
1     |    18
1     |    24
1     |    64
1     |    82
1     |    10
1     |     7
2     |     5
2     |    18
2     |    66
2     |    72
3     |    81
3     |    97

And for each a, I want the count of b values that are < 50. The result would look like:

a     |   bcnt
--------------
1     |      4
2     |      2
3     |      0

Now I could achieve this result in either of two ways:

SELECT a, COUNT(CASE WHEN b < 50 THEN 1 ELSE NULL END) AS bcnt
FROM test
GROUP BY a

Or:

SELECT a, SUM(CASE WHEN b < 50 THEN 1 ELSE 0 END) AS bcnt
FROM test
GROUP BY a
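
As a quick sanity check that the two forms agree on the sample data, here is a minimal sketch. It assumes SQLite (through Python's built-in sqlite3 module) as a stand-in, since the question does not name a DBMS; the ORDER BY a and the variable names are additions for the sake of a deterministic, readable check.

```python
import sqlite3

# Sample rows copied from the table in the question.
ROWS = [(1, 18), (1, 24), (1, 64), (1, 82), (1, 10), (1, 7),
        (2, 5), (2, 18), (2, 66), (2, 72),
        (3, 81), (3, 97)]

# In-memory SQLite database; an assumption, the question names no DBMS.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (a INTEGER, b INTEGER)")
conn.executemany("INSERT INTO test VALUES (?, ?)", ROWS)

# COUNT ignores NULLs, so only rows with b < 50 are counted.
COUNT_SQL = """
SELECT a, COUNT(CASE WHEN b < 50 THEN 1 ELSE NULL END) AS bcnt
FROM test GROUP BY a ORDER BY a
"""
# SUM adds 1 for each row with b < 50 and 0 for every other row.
SUM_SQL = """
SELECT a, SUM(CASE WHEN b < 50 THEN 1 ELSE 0 END) AS bcnt
FROM test GROUP BY a ORDER BY a
"""

count_rows = conn.execute(COUNT_SQL).fetchall()
sum_rows = conn.execute(SUM_SQL).fetchall()
print(count_rows)  # [(1, 4), (2, 2), (3, 0)]
print(sum_rows)    # [(1, 4), (2, 2), (3, 0)]
```

Both queries return the expected result from above, including the 0 for a = 3.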

I know this may seem like a trivial matter, but my question is: would there be any advantage (however slight) in using one approach over the other in terms of performance, how many other DBMSs they would work in, clarity of statement, and so on?

4 answers

#1


7  

Performance?

Oh, the difference, if any, would be marginal, I'm sure. It would be nothing for me to worry about.

How many other DBMSs they would work in?

I've no doubt both would work in any major SQL product at least, so, again, this wouldn't be a matter of concern, not to me anyway.

Clarity of statement?

Certainly COUNT expresses more clearly that you want to count things, not add up some arbitrary values. With SUM, you would realise the actual intention only upon reaching the THEN 1 part after skimming through the condition.

Also, if I use COUNT I can omit the ELSE NULL part, because that's what is implied when ELSE is absent. If I omit ELSE 0 in the SUM expression, I may end up with a NULL result instead of the probably expected 0.
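
To illustrate the point about the implied ELSE, here is a small sketch. It assumes SQLite via Python's sqlite3 module as a stand-in DBMS and uses the sample data from the question; the ORDER BY a and the variable names are illustrative additions.

```python
import sqlite3

# Sample rows from the question's table `test`.
ROWS = [(1, 18), (1, 24), (1, 64), (1, 82), (1, 10), (1, 7),
        (2, 5), (2, 18), (2, 66), (2, 72),
        (3, 81), (3, 97)]

conn = sqlite3.connect(":memory:")  # assumption: SQLite stands in for the real DBMS
conn.execute("CREATE TABLE test (a INTEGER, b INTEGER)")
conn.executemany("INSERT INTO test VALUES (?, ?)", ROWS)

# COUNT with no ELSE: the missing branch yields NULL, which COUNT skips.
count_no_else = conn.execute(
    "SELECT a, COUNT(CASE WHEN b < 50 THEN 1 END) FROM test GROUP BY a ORDER BY a"
).fetchall()

# SUM with no ELSE: a group whose rows all yield NULL sums to NULL, not 0.
sum_no_else = conn.execute(
    "SELECT a, SUM(CASE WHEN b < 50 THEN 1 END) FROM test GROUP BY a ORDER BY a"
).fetchall()

print(count_no_else)  # [(1, 4), (2, 2), (3, 0)]
print(sum_no_else)    # [(1, 4), (2, 2), (3, None)]
```

Here a = 3 has no b below 50, so the SUM version reports NULL (None in Python) where the COUNT version reports 0.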

On the other hand, there may be the opposite situation, where it would be more convenient to return NULL instead of 0 as the result of counting. If I used COUNT, I would have to do something like NULLIF(COUNT(CASE ...), 0), while with SUM(CASE ...) it would be enough to leave out the ELSE clause. But even in that case I might still prefer the somewhat longer clarity to the slightly more obscure brevity (other things being equal).
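
For the case where NULL is the wanted result, the two spellings can be compared side by side. This is only a sketch, again assuming SQLite via Python's sqlite3 module and the question's sample data, with illustrative variable names and an added ORDER BY a.

```python
import sqlite3

# Sample rows from the question's table `test`.
ROWS = [(1, 18), (1, 24), (1, 64), (1, 82), (1, 10), (1, 7),
        (2, 5), (2, 18), (2, 66), (2, 72),
        (3, 81), (3, 97)]

conn = sqlite3.connect(":memory:")  # assumption: SQLite stands in for the real DBMS
conn.execute("CREATE TABLE test (a INTEGER, b INTEGER)")
conn.executemany("INSERT INTO test VALUES (?, ?)", ROWS)

# Explicit: count first, then turn a count of 0 into NULL.
nullif_count = conn.execute(
    "SELECT a, NULLIF(COUNT(CASE WHEN b < 50 THEN 1 END), 0) AS bcnt "
    "FROM test GROUP BY a ORDER BY a"
).fetchall()

# Brief: rely on SUM returning NULL when every input is NULL.
sum_without_else = conn.execute(
    "SELECT a, SUM(CASE WHEN b < 50 THEN 1 END) AS bcnt "
    "FROM test GROUP BY a ORDER BY a"
).fetchall()

print(nullif_count)      # [(1, 4), (2, 2), (3, None)]
print(sum_without_else)  # [(1, 4), (2, 2), (3, None)]
```

The results match; the NULLIF version is longer but states the intent in the query text itself.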

#2


3  

Personally, I would use

select a, count(b)
  from test
 where b < 50
 group by a

Clear, concise and, according to this SQL fiddle, a tiny bit quicker than the others (it needs less data according to the execution plan, though with a table that small you won't notice a difference).
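
One caveat worth checking before adopting the WHERE version: rows are filtered out before grouping, so a group with no qualifying rows disappears from the result instead of showing 0. A minimal sketch, assuming SQLite via Python's sqlite3 module and the sample data from the question (the ORDER BY a and the variable names are illustrative additions):

```python
import sqlite3

# Sample rows from the question's table `test`.
ROWS = [(1, 18), (1, 24), (1, 64), (1, 82), (1, 10), (1, 7),
        (2, 5), (2, 18), (2, 66), (2, 72),
        (3, 81), (3, 97)]

conn = sqlite3.connect(":memory:")  # assumption: SQLite stands in for the real DBMS
conn.execute("CREATE TABLE test (a INTEGER, b INTEGER)")
conn.executemany("INSERT INTO test VALUES (?, ?)", ROWS)

# WHERE removes the non-matching rows before GROUP BY sees them.
where_rows = conn.execute(
    "SELECT a, COUNT(b) FROM test WHERE b < 50 GROUP BY a ORDER BY a"
).fetchall()

# The conditional aggregate keeps every group, even when nothing matches.
case_rows = conn.execute(
    "SELECT a, COUNT(CASE WHEN b < 50 THEN 1 END) FROM test GROUP BY a ORDER BY a"
).fetchall()

print(where_rows)  # [(1, 4), (2, 2)]
print(case_rows)   # [(1, 4), (2, 2), (3, 0)]
```

On this data the WHERE query returns no row for a = 3, whereas the expected result in the question lists a = 3 with a count of 0. So the two are interchangeable only if groups without matches may be dropped.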

#3


2  

What's wrong with a WHERE clause?

select a, count(b)
from test
where b < 50
group by a

#4


-1  

With COUNT, you count the elements; with SUM, you add numbers (positive, negative or zero), so the result can be negative.
