SQL Query Previous Row Optimisation

Posted by 主宰稳场 on 2019-11-28 01:36:13
Michael Pakhantsov
SELECT t1.FileName, t1.CreatedDate, t2.CreatedDate AS PrevCreatedDate
FROM 
   (SELECT FileName, CreatedDate,
          ROW_NUMBER() OVER(PARTITION BY FileName ORDER BY CreatedDate) AS OrderNo
   FROM MyTable) t1
LEFT JOIN
   (SELECT FileName, CreatedDate,
          ROW_NUMBER() OVER(PARTITION BY FileName ORDER BY CreatedDate) AS OrderNo
   FROM MyTable) t2
-- t2 is the row numbered one lower than t1, i.e. the previous row for the same file
ON (t1.FileName = t2.FileName AND t1.OrderNo = t2.OrderNo + 1)

Or it may be better to use 'WITH', because the two subqueries are identical:

WITH t(ObjectID, FileName, CreatedDate, OrderNo) AS
   (SELECT ObjectID, FileName, CreatedDate,
          ROW_NUMBER() OVER(PARTITION BY FileName ORDER BY CreatedDate) AS OrderNo
   FROM MyTable) 
SELECT t1.ObjectID, t1.FileName, t1.CreatedDate, t2.CreatedDate AS PrevCreatedDate,
        DATEDIFF("SS", '1900-01-01 00:00:00', 
           COALESCE((t1.CreatedDate - t2.CreatedDate),0)) AS secondsTaken
FROM t t1 LEFT JOIN t t2 
ON (t1.FileName = t2.FileName AND t1.OrderNo = t2.OrderNo + 1)
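
The direction of the join condition is easy to get backwards, so here is a minimal sketch that runs the same shape of query on a handful of rows. It uses Python's built-in sqlite3 module purely so it can run without a SQL Server instance (an assumption: the bundled SQLite must be 3.25 or later for window functions). The sample rows are made up for illustration, and the dates are stored as text.

```python
import sqlite3

# Hypothetical sample data; table and column names follow the queries above.
rows = [
    ("a.txt", "2011-01-01 10:00:00"),
    ("a.txt", "2011-01-01 10:05:00"),
    ("a.txt", "2011-01-01 10:20:00"),
    ("b.txt", "2011-01-02 09:00:00"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (FileName TEXT, CreatedDate TEXT)")
conn.executemany("INSERT INTO MyTable VALUES (?, ?)", rows)

# Same shape as the WITH query: number the rows per file, then join each row
# to the row whose number is one lower (t1.OrderNo = t2.OrderNo + 1).
query = """
WITH t AS (
    SELECT FileName, CreatedDate,
           ROW_NUMBER() OVER (PARTITION BY FileName ORDER BY CreatedDate) AS OrderNo
    FROM MyTable
)
SELECT t1.FileName, t1.CreatedDate, t2.CreatedDate AS PrevCreatedDate
FROM t t1
LEFT JOIN t t2
  ON t1.FileName = t2.FileName AND t1.OrderNo = t2.OrderNo + 1
ORDER BY t1.FileName, t1.CreatedDate
"""
prev_rows = conn.execute(query).fetchall()
for row in prev_rows:
    print(row)
```

The first row of each file has no predecessor, so the LEFT JOIN leaves its PrevCreatedDate as NULL (None in Python). Every other row carries the CreatedDate of the row just before it.
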
Martin Smith

I think Michael's answer should indeed prove more efficient. When evaluating efficiency, though, I want to draw attention to an issue with the query costs (relative to the batch) shown in Management Studio.

I set up a test table with 23,174 rows and ran both the query in the question and Michael's. Going by the "query cost (relative to the batch)" in the actual execution plan, the original query accounted for 1% of the cost and Michael's for 99%, so Michael's appears to be massively less efficient.

However, the actual stats tell a completely different story:

Cross Apply Approach

SQL Server Execution Times: CPU time = 0 ms, elapsed time = 0 ms.
Table 'MyTable'. Scan count 23175, logical reads 49335, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

ROW_NUMBER approach

SQL Server Execution Times: CPU time = 391 ms, elapsed time = 417 ms.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'MyTable'. Scan count 2, logical reads 148, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

In the ROW_NUMBER plan, the Merge Join on rownumber = rownumber + 1 has 23,174 rows going into both sides. The join key is unique, and the actual number of rows out is 23,174 as well. However, SQL Server estimates that the join will produce 34,812,000 rows, and thus its estimated cost for the insert later in the plan is wildly inaccurate.
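
The "rows in equals rows out" observation can be checked on synthetic data. The sketch below is a rough illustration and not a reproduction of SQL Server's estimate. It again assumes SQLite 3.25+ via Python's sqlite3 as a stand-in, with made-up file names and integer timestamps. Because (FileName, OrderNo) is unique, each row can match at most one other row, so the LEFT JOIN cannot multiply rows.

```python
import random
import sqlite3

random.seed(0)
# Hypothetical data: unique (FileName, CreatedDate) pairs with integer timestamps.
data = sorted({(f"file{random.randrange(50)}", random.randrange(10_000))
               for _ in range(2000)})

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (FileName TEXT, CreatedDate INTEGER)")
conn.executemany("INSERT INTO MyTable VALUES (?, ?)", data)

rows_in = conn.execute("SELECT COUNT(*) FROM MyTable").fetchone()[0]

# Count what comes out of the self-join on OrderNo = OrderNo + 1.
rows_out = conn.execute("""
WITH t AS (
    SELECT FileName, CreatedDate,
           ROW_NUMBER() OVER (PARTITION BY FileName ORDER BY CreatedDate) AS OrderNo
    FROM MyTable
)
SELECT COUNT(*)
FROM t t1
LEFT JOIN t t2
  ON t1.FileName = t2.FileName AND t1.OrderNo = t2.OrderNo + 1
""").fetchone()[0]

print(rows_in, rows_out)  # the two counts are equal
```

An optimizer that cannot tell the computed row number is unique per file has no such guarantee to rely on, which is consistent with the inflated estimate described above.
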

Test Script

BEGIN TRAN

CREATE TABLE MyTable
  (
     [ObjectID]    [INT] IDENTITY(1, 1) NOT NULL PRIMARY KEY CLUSTERED,
     [FileName]    [VARCHAR](50) NULL,
     [CreatedDate] [DATETIME] NULL
  )

GO

INSERT INTO MyTable
SELECT ISNULL(type, NEWID()),
       DATEADD(DAY, CAST(RAND(CAST(NEWID() AS VARBINARY)) * 10000 AS INT), GETDATE())
FROM   master.dbo.spt_values,
       (SELECT TOP 10 1 AS X FROM  master.dbo.spt_values) V


DELETE FROM MyTable
WHERE  EXISTS(SELECT *
              FROM   MyTable m2
              WHERE  MyTable.CreatedDate = m2.CreatedDate
                     AND MyTable.FileName = m2.FileName
                     AND MyTable.ObjectID < m2.ObjectID)

CREATE UNIQUE NONCLUSTERED INDEX [IX_MyTable]
  ON MyTable ([FileName] ASC, [CreatedDate] ASC)

SET STATISTICS IO ON
SET STATISTICS TIME ON

SELECT A.ObjectID,
       A.FileName,
       A.CreatedDate                                                                                AS CreatedDate,
       B.PrevRowCreatedDate,
       DATEDIFF("SS", '1900-01-01 00:00:00', COALESCE(( A.CreatedDate - B.PrevRowCreatedDate ), 0)) AS secondsTaken
INTO   #A
FROM   MyTable AS A
       CROSS APPLY ((SELECT PrevRowCreatedDate = MAX(CreatedDate)
                     FROM   MyTable AS BB
                     WHERE  BB.FileName = A.FileName
                            AND BB.CreatedDate < A.CreatedDate)) AS B;

WITH t(ObjectID, FileName, CreatedDate, OrderNo)
     AS (SELECT ObjectID,
                FileName,
                CreatedDate,
                RANK() OVER(PARTITION BY FileName ORDER BY CreatedDate) AS OrderNo
         FROM   MyTable)
SELECT t1.ObjectID,
       t1.FileName,
       t1.CreatedDate,
       t2.CreatedDate                                                                          AS PrevCreatedDate,
       DATEDIFF("SS", '1900-01-01 00:00:00', COALESCE(( t1.CreatedDate - t2.CreatedDate ), 0)) AS secondsTaken
INTO   #B
FROM   t t1
       LEFT JOIN t t2
         ON ( t1.FileName = t2.FileName
              AND t1.OrderNo = t2.OrderNo + 1 )

/*Test the 2 queries give the same result*/
SELECT *
FROM   #A
EXCEPT
SELECT *
FROM   #B

SELECT *
FROM   #B
EXCEPT
SELECT *
FROM   #A

ROLLBACK 
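
For readers without a SQL Server instance, the equivalence check at the end of the script can be sketched with Python's sqlite3 module. Several things here are assumptions for illustration: SQLite (3.25+) has no CROSS APPLY, so a correlated MAX() subquery stands in for it; CreatedDate is an integer instead of a DATETIME; the data is random; and Python set difference plays the role of the two EXCEPT queries. As in the script above, (FileName, CreatedDate) pairs are kept unique, which is what makes the two approaches agree.

```python
import random
import sqlite3

random.seed(1)
# Unique (FileName, CreatedDate) pairs, mirroring the DELETE and unique index above.
pairs = sorted({(f"file{random.randrange(20)}", random.randrange(5000))
                for _ in range(500)})

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (ObjectID INTEGER PRIMARY KEY, "
             "FileName TEXT, CreatedDate INTEGER)")
conn.executemany("INSERT INTO MyTable (FileName, CreatedDate) VALUES (?, ?)", pairs)

# Stand-in for the CROSS APPLY query: latest earlier date for the same file.
apply_style = conn.execute("""
SELECT A.ObjectID, A.FileName, A.CreatedDate,
       (SELECT MAX(BB.CreatedDate)
        FROM MyTable AS BB
        WHERE BB.FileName = A.FileName
          AND BB.CreatedDate < A.CreatedDate) AS PrevCreatedDate
FROM MyTable AS A
""").fetchall()

# The ROW_NUMBER self-join approach.
row_number_style = conn.execute("""
WITH t AS (
    SELECT ObjectID, FileName, CreatedDate,
           ROW_NUMBER() OVER (PARTITION BY FileName ORDER BY CreatedDate) AS OrderNo
    FROM MyTable
)
SELECT t1.ObjectID, t1.FileName, t1.CreatedDate, t2.CreatedDate AS PrevCreatedDate
FROM t t1
LEFT JOIN t t2
  ON t1.FileName = t2.FileName AND t1.OrderNo = t2.OrderNo + 1
""").fetchall()

# Equivalent of running EXCEPT in both directions: both differences should be empty.
only_in_apply = set(apply_style) - set(row_number_style)
only_in_row_number = set(row_number_style) - set(apply_style)
print(len(only_in_apply), len(only_in_row_number))
```

If duplicate (FileName, CreatedDate) pairs were allowed, the two approaches could differ, because the MAX() subquery skips rows with an equal date while ROW_NUMBER still numbers them consecutively.
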