Variant use of the GROUP BY clause in TSQL

痞子三分冷 提交于 2019-12-12 12:52:18

问题


Imagine the following schema and sample data (SQL Server 2008):

OriginatingObject
----------------------------------------------
ID
1
2
3

ValueSet
----------------------------------------------
ID   OriginatingObjectID   DateStamp
1    1                     2009-05-21 10:41:43
2    1                     2009-05-22 12:11:51
3    1                     2009-05-22 12:13:25
4    2                     2009-05-21 10:42:40
5    2                     2009-05-20 02:21:34
6    1                     2009-05-21 23:41:43
7    3                     2009-05-26 14:56:01

Value
----------------------------------------------
ID   ValueSetID   Value
1    1            28
etc (a set of rows for each related ValueSet)

I need to obtain the ID of the most recent ValueSet record for each OriginatingObject. Do not assume that the higher the ID of a record, the more recent it is.

I am not sure how to use GROUP BY properly in order to make sure the set of results grouped together to form each aggregate row includes the ID of the row with the highest DateStamp value for that grouping. Do I need to use a subquery or is there a better way?


回答1:


You can do it with a correlated subquery or using IN with multiple columns and a GROUP-BY.

Please note, simple GROUP-BY can only bring you to the list of OriginatingIDs and Timestamps. In order to pull the relevant ValueSet IDs, the cleanest solution is use a subquery.

Multiple-column IN with GROUP-BY (probably faster):

SELECT O.ID, V.ID
FROM Originating AS O, ValueSet AS V
WHERE O.ID = V.OriginatingID
AND
(V.OriginatingID, V.DateStamp) IN
(
    SELECT OriginatingID, Max(DateStamp)
    FROM ValueSet
    GROUP BY OriginatingID
)

Correlated Subquery:

SELECT O.ID, V.ID
FROM Originating AS O, ValueSet AS V
WHERE O.ID = V.OriginatingID
AND
V.DateStamp =
(
    SELECT Max(DateStamp)
    FROM ValueSet V2
    WHERE V2.OriginatingID = O.ID
)



回答2:


SELECT OriginatingObjectID, id
FROM (
    SELECT id, OriginatingObjectID, RANK() OVER(PARTITION BY OriginatingObjectID 
                                    ORDER BY DateStamp DESC) as ranking
    FROM ValueSet)
WHERE ranking = 1;



回答3:


This can be done with a correlated sub-query. No GROUP-BY necessary.

SELECT
  vs.ID,
  vs.OriginatingObjectID,
  vs.DateStamp,
  v.Value
FROM
  ValueSet vs
  INNER JOIN Value v ON v.ValueSetID = vs.ID 
WHERE
  NOT EXISTS (
    SELECT 1 
    FROM   ValueSet 
    WHERE  OriginatingObjectID = vs.OriginatingObjectID 
           AND DateStamp > vs.DateStamp
  )

This works only if there can not be two equal DateStamps for a OriginatingObjectID in the ValueSet table.



来源:https://stackoverflow.com/questions/1187644/variant-use-of-the-group-by-clause-in-tsql

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!