TSQL OVER子句:COUNT(*) OVER (ORDER BY a)

时间:2021-10-22 22:59:41

This is my code:

这是我的代码:

USE [tempdb];
GO

IF OBJECT_ID(N'dbo.t') IS NOT NULL
BEGIN
    DROP TABLE dbo.t
END
GO

CREATE TABLE dbo.t
(
    a NVARCHAR(8),
    b NVARCHAR(8)
);
GO

INSERT t VALUES ('a', 'b');
INSERT t VALUES ('a', 'b');
INSERT t VALUES ('a', 'b');
INSERT t VALUES ('c', 'd');
INSERT t VALUES ('c', 'd');
INSERT t VALUES ('c', 'd');
INSERT t VALUES ('c', 'd');
INSERT t VALUES ('e', NULL);
INSERT t VALUES (NULL, NULL);
INSERT t VALUES (NULL, NULL);
INSERT t VALUES (NULL, NULL);
INSERT t VALUES (NULL, NULL);
GO

SELECT  a, b,
    COUNT(*) OVER (ORDER BY a)
FROM    t;

On this page of BOL, Microsoft says that:

在BOL的这一页,微软说:

If PARTITION BY is not specified, the function treats all rows of the query result set as a single group.

如果未指定PARTITION BY,则函数将查询结果集的所有行视为一个组。

So based on my understanding, the last SELECT statement will give me the following result. Since all records are considered as in one single group, right?

根据我的理解,最后一个SELECT语句将给出如下结果。因为所有记录都被认为是在一个组中,对吧?

a        b        
-------- -------- -----------
NULL     NULL     12
NULL     NULL     12
NULL     NULL     12
NULL     NULL     12
a        b        12
a        b        12
a        b        12
c        d        12
c        d        12
c        d        12
c        d        12
e        NULL     12

But the actual result is:

但实际结果是:

a        b        
-------- -------- -----------
NULL     NULL     4
NULL     NULL     4
NULL     NULL     4
NULL     NULL     4
a        b        7
a        b        7
a        b        7
c        d        11
c        d        11
c        d        11
c        d        11
e        NULL     12

Anyone can help to explain why? Thanks.

有人能解释一下为什么吗?谢谢。

1 个解决方案

#1


28  

It gives a running total (this functionality was not implemented in SQL Server until version 2012.)

它给出一个运行总数(这个功能直到2012年版本才在SQL Server中实现)。

The ORDER BY defines the window to be aggregated with UNBOUNDED PRECEDING and CURRENT ROW as the default when not specified. SQL Server defaults to the less well performing RANGE option rather than ROWS.

ORDER BY定义了在未指定时使用*前行和当前行作为默认值聚合的窗口。SQL Server默认使用性能较差的范围选项,而不是行。

They have different semantics in the case of ties in that the window for the RANGE version includes not just the current row (and preceding rows) but also any additional tied rows with the same value of a as the current row. This can be seen in the number of rows counted by each in the results below.

它们在绑定的情况下有不同的语义,因为范围版本的窗口不仅包括当前行(和前面的行),还包括与当前行值相同的任何附加绑定行。这可以从下面的结果中的行数中看出。

SELECT  a, 
        b,
        COUNT(*) OVER (ORDER BY a 
                         ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS  [Rows],
        COUNT(*) OVER (ORDER BY a 
                         RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS [Range],
        COUNT(*) OVER() AS [Over()]
    FROM    t;

Returns

返回

a        b        Rows        Range       Over()
-------- -------- ----------- ----------- -----------
NULL     NULL     1           4           12
NULL     NULL     2           4           12
NULL     NULL     3           4           12
NULL     NULL     4           4           12
a        b        5           7           12
a        b        6           7           12
a        b        7           7           12
c        d        8           11          12
c        d        9           11          12
c        d        10          11          12
c        d        11          11          12
e        NULL     12          12          12

To achieve the result that you were expecting to get omit both the PARTITION BY and ORDER BY and use an empty OVER() clause (also shown above).

为了实现预期的结果,您将省略分区BY和ORDER BY,并使用empty OVER()子句(也如上面所示)。

#1


28  

It gives a running total (this functionality was not implemented in SQL Server until version 2012.)

它给出一个运行总数(这个功能直到2012年版本才在SQL Server中实现)。

The ORDER BY defines the window to be aggregated with UNBOUNDED PRECEDING and CURRENT ROW as the default when not specified. SQL Server defaults to the less well performing RANGE option rather than ROWS.

ORDER BY定义了在未指定时使用*前行和当前行作为默认值聚合的窗口。SQL Server默认使用性能较差的范围选项,而不是行。

They have different semantics in the case of ties in that the window for the RANGE version includes not just the current row (and preceding rows) but also any additional tied rows with the same value of a as the current row. This can be seen in the number of rows counted by each in the results below.

它们在绑定的情况下有不同的语义,因为范围版本的窗口不仅包括当前行(和前面的行),还包括与当前行值相同的任何附加绑定行。这可以从下面的结果中的行数中看出。

SELECT  a, 
        b,
        COUNT(*) OVER (ORDER BY a 
                         ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS  [Rows],
        COUNT(*) OVER (ORDER BY a 
                         RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS [Range],
        COUNT(*) OVER() AS [Over()]
    FROM    t;

Returns

返回

a        b        Rows        Range       Over()
-------- -------- ----------- ----------- -----------
NULL     NULL     1           4           12
NULL     NULL     2           4           12
NULL     NULL     3           4           12
NULL     NULL     4           4           12
a        b        5           7           12
a        b        6           7           12
a        b        7           7           12
c        d        8           11          12
c        d        9           11          12
c        d        10          11          12
c        d        11          11          12
e        NULL     12          12          12

To achieve the result that you were expecting to get omit both the PARTITION BY and ORDER BY and use an empty OVER() clause (also shown above).

为了实现预期的结果,您将省略分区BY和ORDER BY,并使用empty OVER()子句(也如上面所示)。