Selecting every Nth row per user in Postgres

Question

I was using this SQL statement:

SELECT "dateId", "userId", "Salary" 
FROM (
   SELECT *, 
          (row_number() OVER (ORDER BY "userId", "dateId"))%2 AS rn 
   FROM user_table
 ) sa 
 WHERE sa.rn=1 
   AND "userId" = 789 
   AND "Salary" > 0;

But every time the table gets new rows the result of the query is different.
Am I missing something?

Answer 1:

This assumes that ("dateId", "userId") is unique and that new rows always have a bigger (later) dateId.

After some comments:

What I think you need:

SELECT "dateId", "userId", "Salary"
FROM (
   SELECT "dateId", "userId", "Salary"
         ,(row_number() OVER (PARTITION BY "userId"   -- either this
                              ORDER BY "dateId")) % 2 AS rn
   FROM   user_table
   WHERE  "userId" = 789                              -- ... or that
   ) sub
WHERE  sub.rn = 1
AND    "Salary" > 0;

Note the PARTITION BY. Row numbers now restart for each userId, so you skip every second dateId per user, and rows added later (with a later dateId) do not change which of the earlier rows are selected.

Alternatively, as long as you are selecting rows for a single userId (WHERE "userId" = 789), you can move the predicate into the subquery, which achieves the same effect (a stable selection for that one user). You don't need both.

The WHERE clause in the subquery only works for a single user; PARTITION BY works for any number of users in one query.
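
A minimal sketch of the PARTITION BY variant run over several users at once, generalized from every second row to every Nth. The sample rows in the VALUES list and the step of 3 are made up for illustration; only the column names come from the question.

```sql
-- Hypothetical sample data standing in for user_table.
WITH user_table ("dateId", "userId", "Salary") AS (
   VALUES (1, 788, 100)
        , (2, 788, 110)
        , (1, 789, 200)
        , (2, 789, 210)
        , (3, 789, 220)
        , (4, 789, 230)
)
SELECT "dateId", "userId", "Salary"
FROM  (
   SELECT "dateId", "userId", "Salary"
        , row_number() OVER (PARTITION BY "userId"   -- numbering restarts per user
                             ORDER BY "dateId") AS rn
   FROM   user_table
   ) sub
WHERE  (sub.rn - 1) % 3 = 0   -- keeps rows 1, 4, 7, ... of each user; 3 is an arbitrary N
AND    "Salary" > 0
ORDER  BY "userId", "dateId";
```

With these sample rows the query keeps dateId 1 for user 788 and dateId 1 and 4 for user 789. Appending later dateId values for either user leaves those picks unchanged, because each user's numbering depends only on that user's own rows.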

Is that it? Is it?
They should give me a "detective" badge for this.
Seriously.

Answer 2:

No, the query seems to be OK. The new rows shift the positions of the old rows after sorting, so the old rows get different row numbers.

Answer 3:

If someone inserts a new row with a userId below 789, the order will change. For example, if you have:

and you insert a row with userId = 2, the rn will change:

To select every Nth row reliably, you need a column with a sequence or a timestamp to order by.
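
A small sketch of the effect described in this answer, using made-up rows (the values are hypothetical, not taken from the original post). It numbers the same three rows for user 789 twice: once as they are, and once after a row with a smaller userId has been added.

```sql
WITH before_insert ("dateId", "userId") AS (
   VALUES (1, 789), (2, 789), (3, 789)
)
, after_insert AS (
   SELECT * FROM before_insert
   UNION ALL
   VALUES (1, 2)                 -- the new row, with a userId below 789
)
SELECT 'before' AS phase, "dateId", "userId"
     , (row_number() OVER (ORDER BY "userId", "dateId")) % 2 AS rn
FROM   before_insert
UNION ALL
SELECT 'after', "dateId", "userId"
     , (row_number() OVER (ORDER BY "userId", "dateId")) % 2
FROM   after_insert
ORDER  BY phase DESC, "userId", "dateId";
```

In the 'before' rows, user 789 gets rn = 1, 0, 1 for dateId 1, 2, 3. In the 'after' rows, the new row takes the first position, so the same three rows get rn = 0, 1, 0, and a filter on rn = 1 now returns a different set.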

Source: https://stackoverflow.com/questions/7518788/selecting-every-nth-row-per-user-in-postgres

Tags

sql

postgresql

select

window-functions

row-number