Question
I was using this SQL statement:
SELECT "dateId", "userId", "Salary"
FROM (
    SELECT *,
           (row_number() OVER (ORDER BY "userId", "dateId")) % 2 AS rn
    FROM user_table
) sa
WHERE sa.rn = 1
  AND "userId" = 789
  AND "Salary" > 0;
But every time the table gets new rows, the result of the query is different.
Am I missing something?
Answer 1:
This assumes that ("dateId", "userId") is unique and that new rows always have a bigger (later) "dateId".
After some comments, here is what I think you need:
SELECT "dateId", "userId", "Salary"
FROM (
    SELECT "dateId", "userId", "Salary"
         , (row_number() OVER (PARTITION BY "userId"  -- either this
                               ORDER BY "dateId")) % 2 AS rn
    FROM user_table
    WHERE "userId" = 789                              -- ... or that
) sub
WHERE sub.rn = 1
  AND "Salary" > 0;
Notice the PARTITION BY. This way you skip every second "dateId" for each "userId", and additional (later) rows don't change the selection so far.
Also, as long as you are selecting rows for a single "userId" (WHERE "userId" = 789), pull the predicate into the subquery, achieving the same effect (stable selection for a single user). You don't need both.
The WHERE clause in the subquery only works for a single user; PARTITION BY works for any number of users in one query.
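To check the stability claim, here is a minimal sketch. The column types, the sample rows, and the view name every_second_row are made up for illustration; only the table and column names come from the question. It wraps the partitioned numbering in a view, queries it, appends rows, and queries it again.

```sql
-- Hypothetical schema and data; only the names come from the question.
CREATE TEMP TABLE user_table (
    "dateId" int,
    "userId" int,
    "Salary" numeric,
    PRIMARY KEY ("dateId", "userId")
);

INSERT INTO user_table ("dateId", "userId", "Salary") VALUES
    (1, 789, 100), (2, 789, 110), (3, 789, 120), (4, 789, 130),
    (1, 123, 200), (2, 123, 210);

-- Hypothetical helper view: every second row, numbered separately per user.
CREATE TEMP VIEW every_second_row AS
SELECT "dateId", "userId", "Salary"
FROM (
    SELECT "dateId", "userId", "Salary"
         , (row_number() OVER (PARTITION BY "userId"
                               ORDER BY "dateId")) % 2 AS rn
    FROM user_table
) sub
WHERE sub.rn = 1
  AND "Salary" > 0;

SELECT * FROM every_second_row WHERE "userId" = 789 ORDER BY "dateId";
-- returns the rows with "dateId" 1 and 3

-- Add a user with a smaller "userId" and later dates for existing users.
INSERT INTO user_table ("dateId", "userId", "Salary") VALUES
    (1, 5, 300), (3, 123, 220), (5, 789, 140), (6, 789, 150);

SELECT * FROM every_second_row WHERE "userId" = 789 ORDER BY "dateId";
-- returns the rows with "dateId" 1, 3 and 5: the earlier picks are unchanged
```

Because the numbering restarts for each "userId", the new user 5 cannot shift the rows of user 789, and the later dates only extend the result.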
Is that it? Is it?
They should give me "detective" badge for this.
Seriously.
Answer 2:
No, that seems to be OK. When new rows arrive, they push the old rows to different positions after sorting, so their row numbers change.
Answer 3:
If someone inserts a new row with a userId below 789, the order will change. For example, if you have:
userId rn
1 1
4 0
5 1
6 0
and you insert a row with userId = 2, the rn will change:
userId rn
1 1
2 0
4 1
5 0
6 1
In order to select every Nth row, you need to order by a column with a sequence or a timestamp, so that new rows always sort after the existing ones.
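As a sketch of the general case, assuming "dateId" behaves like such a sequence (it only ever grows for a given user): subtracting 1 from the row number before taking the modulus picks rows 1, N+1, 2N+1, ... per user. The inline sample data and the choice N = 3 are made up for illustration.

```sql
-- Hypothetical inline data standing in for user_table.
WITH user_table ("dateId", "userId", "Salary") AS (
    VALUES (1, 789, 100), (2, 789, 110), (3, 789, 120), (4, 789, 130),
           (5, 789, 140), (6, 789, 150), (7, 789, 160),
           (1, 5, 300),   (2, 5, 310)
)
SELECT "dateId", "userId", "Salary"
FROM (
    SELECT "dateId", "userId", "Salary"
         , row_number() OVER (PARTITION BY "userId"
                              ORDER BY "dateId") AS rn
    FROM user_table
) sub
WHERE (sub.rn - 1) % 3 = 0   -- N = 3: keep rows 1, 4, 7, ... of each user
ORDER BY "userId", "dateId";
-- returns "dateId" 1 for user 5 and "dateId" 1, 4, 7 for user 789
```

With N = 2 this selects the same rows as the rn % 2 = 1 filter used in the answers above.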
Source: https://stackoverflow.com/questions/7518788/selecting-every-nth-row-per-user-in-postgres