问题
Is it possible to write an aggregate function in PostgreSQL that will calculate a delta value, by substracting the initial (last value in the column) from the current(first value in column) ? It would apply on a structure like this
rankings (userId, rank, timestamp)
And could be used like
SELECT userId, custum_agg(rank) OVER w
FROM rankings
WINDOWS w AS (PARTITION BY userId ORDER BY timstamp desc)
returning for an userId the rank of the newest entry (by timestamp) - rank of the oldest entry (by timestamp)
Thanks!
回答1:
the rank of the newest entry (by timestamp) - rank of the oldest entry (by timestamp)
There are many ways to achieve this with existing functions.
You can use the existing window functions first_value() and last_value(), combined with DISTINCT
or DISTINCT ON
to get it without joins and subqueries:
SELECT DISTINCT ON (userid)
userid
, last_value(rank) OVER w
- first_value(rank) OVER w AS rank_delta
FROM rankings
WINDOW w AS (PARTITION BY userid ORDER BY ts
ROWS BETWEEN UNBOUNDED PRECEDING
AND UNBOUNDED FOLLOWING);
Note the custom frames for the window functions!
Or you can use basic aggregate functions in a subquery and JOIN:
SELECT userid, r2.rank - r1.rank AS rank_delta
FROM (
SELECT userid
, min(ts) AS first_ts
, max(ts) AS last_ts
FROM rankings
GROUP BY 1
) sub
JOIN rankings r1 USING (userid)
JOIN rankings r2 USING (userid)
WHERE r1.ts = first_ts
AND r2.ts = last_ts;
Assuming unique (userid, rank)
, or your requirements would be ambiguous.
SQL Fiddle demo.
Shichinin no samurai
... a.k.a. "7 Samurai"
Per request in the comments, the same for only the last seven rows per userid (or as many as can be found, if there are fewer):
Again, one of many possible ways. But I believe this to be one of the shortest:
SELECT DISTINCT ON (userid)
userid
, first_value(rank) OVER w
- last_value(rank) OVER w AS rank_delta
FROM rankings
WINDOW w AS (PARTITION BY userid ORDER BY ts DESC
ROWS BETWEEN CURRENT ROW AND 7 FOLLOWING)
ORDER BY userid, ts DESC;
Note the reversed sort order. The first row is the "newest" entry. I span a frame of (max.) 7 rows and pick only the results for the newest entry with DISTINCT ON
.
SQL Fiddle demo.
回答2:
You can do it with JOIN and DISTINCT ON in Postgres. The GRP
query give you the last rank
values for each userID
so just join it with rankings
on user_id
and substract values.
SELECT rankings.userId,
rankings.rank-GRP.rank as delta,
rankings.timestamp
FROM rankings
JOIN
(
SELECT DISTINCT ON (userId) userId, rank, timestamp
FROM rankings
ORDER BY userId, timestamp DESC
) as GRP ON rankings.userId=GRP.userId
SQLFiddle demo
来源:https://stackoverflow.com/questions/22063169/custom-aggregate-function-in-postgresql