Optimal performing query for latest record for each N

Submitted by 依然范特西╮ on 2019-11-30 08:38:06

It depends on your data (how many rows are there per group?) and on your indexes.

See Optimizing TOP N Per Group Queries for some performance comparisons of 3 approaches.

In your case, with millions of rows for only a small number of vehicles, I would add an index on (VehicleID, TimeStamp) and use:

SELECT CA.*
FROM   Vehicles V
       CROSS APPLY (SELECT TOP 1 *
                    FROM   ChannelValue CV
                    WHERE  CV.VehicleID = V.VehicleID
                    ORDER  BY CV.TimeStamp DESC) CA;
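
The index mentioned above could be created roughly as follows. This is a minimal sketch: the index name is made up for illustration, and only the key columns (VehicleID, TimeStamp) come from the answer itself.

```sql
-- Hypothetical index name; key columns are the ones suggested above.
-- With this index, each TOP 1 ... ORDER BY TimeStamp DESC lookup inside
-- the CROSS APPLY can be answered by a seek on VehicleID.
CREATE NONCLUSTERED INDEX IX_ChannelValue_VehicleID_TimeStamp
    ON ChannelValue (VehicleID, TimeStamp DESC);
```

Declaring TimeStamp as DESC is optional here, since an ascending index can also be read backwards, but it matches the sort order of the query.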

If your records are inserted sequentially, ordering by ID instead of TimeStamp in your query may make a difference.
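
As an illustration only, here is what that substitution might look like when applied to a CROSS APPLY query of the kind shown earlier on this page. It assumes ChannelValue has an ever-increasing ID column (for example an identity column), which the original question is not shown to confirm, and it only returns the latest row if insertion order really matches TimeStamp order.

```sql
-- Assumption: ChannelValue.ID increases with every insert, so the highest
-- ID per vehicle is also the most recent reading.
SELECT CA.*
FROM   Vehicles V
       CROSS APPLY (SELECT TOP 1 *
                    FROM   ChannelValue CV
                    WHERE  CV.VehicleID = V.VehicleID
                    ORDER  BY CV.ID DESC) CA;
```

For this variant an index on (VehicleID, ID) would play the same role as the one on (VehicleID, TimeStamp).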

As a side note, how many records is this returning? Your delay could be network overhead if you are getting hundreds of thousands of rows back.

Try this:

SELECT SequencedChannelValue.* -- Specify only the columns you need, and leave out SeqValue
FROM
    (
        SELECT 
            ChannelValue.*,   -- Specify only the columns you need
            SeqValue = ROW_NUMBER() OVER(PARTITION BY VehicleID ORDER BY TimeStamp DESC)
        FROM ChannelValue
    ) AS SequencedChannelValue
WHERE SequencedChannelValue.SeqValue = 1

A table or index scan is expected here, because the query does not filter the data in any way. You are asking for the latest TimeStamp for every VehicleID, so to number the rows in each partition the query engine has to read every row.

You can help it out by narrowing the number of columns being returned (don't use SELECT *), and by providing an index that consists of VehicleID + TimeStamp.
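
Both suggestions can be sketched together as follows. The column name Reading and the index name are hypothetical stand-ins, since the actual columns of ChannelValue are not shown here; substitute whichever columns you really need.

```sql
-- Hypothetical: Reading stands in for the data column(s) you want back.
-- Keying the index on (VehicleID, TimeStamp) and including Reading lets
-- the query be answered from the index alone, already in partition order.
CREATE NONCLUSTERED INDEX IX_ChannelValue_Latest
    ON ChannelValue (VehicleID, TimeStamp DESC)
    INCLUDE (Reading);

SELECT S.VehicleID, S.TimeStamp, S.Reading
FROM   (SELECT VehicleID, TimeStamp, Reading,
               ROW_NUMBER() OVER (PARTITION BY VehicleID
                                  ORDER BY TimeStamp DESC) AS SeqValue
        FROM   ChannelValue) AS S
WHERE  S.SeqValue = 1;
```

The scan still touches every index entry, but a narrow index is usually much smaller than the full table, and the rows arrive pre-sorted so no separate sort step should be needed.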
