TSQL OVER clause: COUNT(*) OVER (ORDER BY a)

≯℡__Kan透↙ 提交于 2019-12-03 16:58:44

问题


This is my code:

USE [tempdb];
GO

IF OBJECT_ID(N'dbo.t') IS NOT NULL
BEGIN
    DROP TABLE dbo.t
END
GO

CREATE TABLE dbo.t
(
    a NVARCHAR(8),
    b NVARCHAR(8)
);
GO

INSERT t VALUES ('a', 'b');
INSERT t VALUES ('a', 'b');
INSERT t VALUES ('a', 'b');
INSERT t VALUES ('c', 'd');
INSERT t VALUES ('c', 'd');
INSERT t VALUES ('c', 'd');
INSERT t VALUES ('c', 'd');
INSERT t VALUES ('e', NULL);
INSERT t VALUES (NULL, NULL);
INSERT t VALUES (NULL, NULL);
INSERT t VALUES (NULL, NULL);
INSERT t VALUES (NULL, NULL);
GO

SELECT  a, b,
    COUNT(*) OVER (ORDER BY a)
FROM    t;

On this page of BOL, Microsoft says that:

If PARTITION BY is not specified, the function treats all rows of the query result set as a single group.

So based on my understanding, the last SELECT statement will give me the following result. Since all records are considered as in one single group, right?

a        b        
-------- -------- -----------
NULL     NULL     12
NULL     NULL     12
NULL     NULL     12
NULL     NULL     12
a        b        12
a        b        12
a        b        12
c        d        12
c        d        12
c        d        12
c        d        12
e        NULL     12

But the actual result is:

a        b        
-------- -------- -----------
NULL     NULL     4
NULL     NULL     4
NULL     NULL     4
NULL     NULL     4
a        b        7
a        b        7
a        b        7
c        d        11
c        d        11
c        d        11
c        d        11
e        NULL     12

Anyone can help to explain why? Thanks.


回答1:


It gives a running total (this functionality was not implemented in SQL Server until version 2012.)

The ORDER BY defines the window to be aggregated with UNBOUNDED PRECEDING and CURRENT ROW as the default when not specified. SQL Server defaults to the less well performing RANGE option rather than ROWS.

They have different semantics in the case of ties in that the window for the RANGE version includes not just the current row (and preceding rows) but also any additional tied rows with the same value of a as the current row. This can be seen in the number of rows counted by each in the results below.

SELECT  a, 
        b,
        COUNT(*) OVER (ORDER BY a 
                         ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS  [Rows],
        COUNT(*) OVER (ORDER BY a 
                         RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS [Range],
        COUNT(*) OVER() AS [Over()]
    FROM    t;

Returns

a        b        Rows        Range       Over()
-------- -------- ----------- ----------- -----------
NULL     NULL     1           4           12
NULL     NULL     2           4           12
NULL     NULL     3           4           12
NULL     NULL     4           4           12
a        b        5           7           12
a        b        6           7           12
a        b        7           7           12
c        d        8           11          12
c        d        9           11          12
c        d        10          11          12
c        d        11          11          12
e        NULL     12          12          12

To achieve the result that you were expecting to get omit both the PARTITION BY and ORDER BY and use an empty OVER() clause (also shown above).



来源:https://stackoverflow.com/questions/14860162/tsql-over-clause-count-over-order-by-a

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!