问题
I am having a weird result when I am trying to get the LAST_VALUE from a table in SQL Server 2012.
This is the table I have
PK | Id1 | Id2
1 | 2 | 5
2 | 2 | 6
3 | 2 | 5
4 | 2 | 6
This is my query
SELECT
Id1, Id2, LAST_VALUE(PK) OVER (PARTITION BY Id1 ORDER BY Id2) AS LastValue
FROM
@Data
This is the result I am expecting
Id1 | Id2 | LastValue
2 | 5 | 3
2 | 5 | 3
2 | 6 | 4
2 | 6 | 4
This is what I am receiving
Id1 | Id2 | LastValue
2 | 5 | 3
2 | 5 | 3
2 | 6 | 2
2 | 6 | 2
Here is a demonstration of the problem
http://sqlfiddle.com/#!6/5c729/1
Is there anything wrong with my query?
回答1:
SQL Server doesn't know or care about the order in which rows were inserted into the table. If you need specific order, always use ORDER BY. In your example ORDER BY is ambiguous, unless you include PK into the ORDER BY. Besides, LAST_VALUE function can return odd results if you are not careful - see below.
You can get your expected result using MAX or LAST_VALUE (SQLFiddle). They are equivalent in this case:
SELECT
PK, Id1, Id2
,MAX(PK) OVER (PARTITION BY Id1, Id2) AS MaxValue
,LAST_VALUE(PK) OVER (PARTITION BY Id1, Id2 ORDER BY PK rows between unbounded preceding and unbounded following) AS LastValue
FROM
Data
ORDER BY id1, id2, PK
Result of this query will be the same regardless of the order in which rows were originally inserted into the table. You can try to put INSERT statements in different order in the fiddle. It doesn't affect the result.
Also, LAST_VALUE behaves not quite as you'd intuitively expect with default window (when you have just ORDER BY in the OVER clause). Default window is ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, while you'd expected it to be ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING. Here is a SO answer with a good explanation. The link to this SO answer is on MSDN page for LAST_VALUE. So, once the row window is specified explicitly in the query it returns what is needed.
If you want to know the order in which rows were inserted into the table, I think, the most simple way is to use IDENTITY. So, definition of your table would change to this:
CREATE TABLE Data
(PK INT IDENTITY(1,1) PRIMARY KEY,
Id1 INT,
Id2 INT)
When you INSERT into this table you don't need to specify the value for PK, the server would generate it automatically. It guarantees that generated values are unique and growing (with positive increment parameter), even if you have many clients inserting into the table at the same time simultaneously. There may be gaps between generated values, but the relative order of the generated values will tell you which row was inserted after which row.
回答2:
It is never a good idea to rely on implicit order caused by the particular implementation of the underlying database engine.
I don't know why, running the query
SELECT * FROM @Data ORDER BY Id2
the result will be
+----+-----+-----+
| PK | id1 | id2 |
+----+-----+-----+
| 1 | 2 | 5 |
| 3 | 2 | 5 |
| 4 | 2 | 6 |
| 2 | 2 | 6 |
+----+-----+-----+
which means SQL Server decided the order of rows in a way that is different from the insert order.
That's why the LAST_VALUE behavior is different from expected, but is consistent with the SQL Server sort method.
But how SQL Server sort your data?
The best answer we have is the accepted answer of this question (from where I took the sentence in the beginning of my answer).
回答3:
SELECT Id1
, Id2
, LAST_VALUE(PK)
OVER (PARTITION BY Id1
ORDER BY Id2 DESC) AS LastValue
FROM Data
ORDER BY Id2 ASC
Result
Id1 Id2 LastValue
2 5 3
2 5 3
2 6 4
2 6 4
来源:https://stackoverflow.com/questions/29971151/last-value-in-sql-server-2012-is-returning-weird-results