Custom aggregate function in PostgreSQL

Question

Is it possible to write an aggregate function in PostgreSQL that calculates a delta value by subtracting the initial value (the last value in the column) from the current value (the first value in the column)? It would apply to a structure like this:

rankings (userId, rank, timestamp)

And could be used like

SELECT userId, custom_agg(rank) OVER w
FROM rankings
WINDOW w AS (PARTITION BY userId ORDER BY timestamp DESC)

returning, for each userId, the rank of the newest entry (by timestamp) minus the rank of the oldest entry (by timestamp).

Thanks!

Answer 1:

the rank of the newest entry (by timestamp) - rank of the oldest entry (by timestamp)

There are many ways to achieve this with existing functions. You can use the existing window functions first_value() and last_value(), combined with DISTINCT or DISTINCT ON to get it without joins and subqueries:

SELECT DISTINCT ON (userid)
       userid
     , last_value(rank) OVER w  
     - first_value(rank) OVER w AS rank_delta
FROM   rankings
WINDOW w AS (PARTITION BY userid ORDER BY ts
             ROWS BETWEEN UNBOUNDED PRECEDING
             AND  UNBOUNDED FOLLOWING);

Note the custom frames for the window functions!
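To illustrate why the frame matters, here is a minimal sketch on hypothetical inline sample data (the three rows are made up for the demonstration). With ORDER BY and no explicit frame, the default frame ends at the current row, so last_value() simply returns the current row's value:

```sql
-- Hypothetical sample data; column names follow the query above.
WITH rankings(userid, rank, ts) AS (
   VALUES (1, 10, timestamp '2014-01-01')
        , (1,  7, timestamp '2014-01-02')
        , (1,  4, timestamp '2014-01-03')
   )
SELECT ts
       -- default frame: RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
     , last_value(rank) OVER (PARTITION BY userid ORDER BY ts) AS default_frame
       -- explicit frame covering the whole partition
     , last_value(rank) OVER (PARTITION BY userid ORDER BY ts
                              ROWS BETWEEN UNBOUNDED PRECEDING
                              AND  UNBOUNDED FOLLOWING) AS full_frame
FROM   rankings
ORDER  BY ts;
-- default_frame: 10, 7, 4   (the current row's rank each time)
-- full_frame:     4, 4, 4   (the rank of the newest entry)
```

Only the explicit frame lets last_value() see the newest entry from every row of the partition.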

Or you can use basic aggregate functions in a subquery and JOIN:

SELECT userid, r2.rank - r1.rank AS rank_delta
FROM  (
  SELECT userid
       , min(ts) AS first_ts
       , max(ts) AS last_ts
   FROM  rankings
   GROUP BY 1
   ) sub
JOIN   rankings r1 USING (userid)
JOIN   rankings r2 USING (userid)
WHERE  r1.ts = first_ts
AND    r2.ts = last_ts;

Assuming unique (userid, ts), or your requirements would be ambiguous.
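For completeness, a literal custom aggregate as asked in the question is also possible with CREATE AGGREGATE. The following is only a sketch under assumptions: rank is assumed to be an integer column, and the names delta_sfunc, delta_ffunc and first_minus_last are hypothetical. The state keeps the first and the most recent input value in a two-element array:

```sql
-- Transition function: remember the first value seen and the latest one.
-- Not STRICT on purpose, so it is also called for the initial NULL state.
CREATE FUNCTION delta_sfunc(acc integer[], val integer)
  RETURNS integer[]
  LANGUAGE sql IMMUTABLE AS
$$
SELECT CASE WHEN acc IS NULL THEN ARRAY[val, val]
            ELSE ARRAY[acc[1], val]
       END
$$;

-- Final function: first input value minus last input value.
CREATE FUNCTION delta_ffunc(acc integer[])
  RETURNS integer
  LANGUAGE sql IMMUTABLE AS
$$
SELECT acc[1] - acc[2]
$$;

CREATE AGGREGATE first_minus_last(integer) (
   SFUNC     = delta_sfunc
 , STYPE     = integer[]
 , FINALFUNC = delta_ffunc
);

-- Usage: the ORDER BY inside the call defines "first" (newest) and "last" (oldest).
SELECT userid
     , first_minus_last(rank ORDER BY ts DESC) AS rank_delta
FROM   rankings
GROUP  BY userid;
```

The result depends entirely on the order in which rows are fed to the aggregate, which is why the ORDER BY inside the aggregate call is essential. The built-in functions shown above are typically simpler and need no extra objects in the database.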


Shichinin no samurai (a.k.a. "7 Samurai")

Per request in the comments, the same for only the last seven rows per userid (or as many as can be found, if there are fewer):

Again, one of many possible ways. But I believe this to be one of the shortest:

SELECT DISTINCT ON (userid)
       userid
     , first_value(rank) OVER w  
     - last_value(rank)  OVER w AS rank_delta
FROM   rankings
WINDOW w AS (PARTITION BY userid ORDER BY ts DESC
             ROWS BETWEEN CURRENT ROW AND 6 FOLLOWING)
ORDER  BY userid, ts DESC;

Note the reversed sort order. The first row is the "newest" entry. I span a frame of (max.) 7 rows and pick only the results for the newest entry with DISTINCT ON.
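A note on counting: a frame from CURRENT ROW to n FOLLOWING contains n + 1 rows, so a seven-row window corresponds to 6 FOLLOWING. The sketch below checks this on hypothetical generated data (one user, ten entries, with integers standing in for timestamps):

```sql
-- Hypothetical data: user 1 with ranks 1..10 at "timestamps" 1..10.
WITH rankings(userid, rank, ts) AS (
   SELECT 1, g, g
   FROM   generate_series(1, 10) g
   )
SELECT DISTINCT ON (userid)
       userid
     , first_value(rank) OVER w AS newest_rank
     , last_value(rank)  OVER w AS oldest_rank_in_frame
     , count(*)          OVER w AS rows_in_frame
FROM   rankings
WINDOW w AS (PARTITION BY userid ORDER BY ts DESC
             ROWS BETWEEN CURRENT ROW AND 6 FOLLOWING)
ORDER  BY userid, ts DESC;
-- userid | newest_rank | oldest_rank_in_frame | rows_in_frame
--      1 |          10 |                    4 |             7
```

For users with fewer rows than the frame allows, the frame simply ends at the last available row.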


Answer 2:

You can do it with JOIN and DISTINCT ON in Postgres. The GRP subquery gives you the latest rank for each userId, so just join it with rankings on userId and subtract the values.

SELECT rankings.userId, 
       rankings.rank-GRP.rank as delta,
       rankings.timestamp
FROM rankings
JOIN
(
    SELECT DISTINCT ON (userId)  userId, rank, timestamp
    FROM rankings
    ORDER BY userId, timestamp DESC
) as GRP ON rankings.userId=GRP.userId


Source: https://stackoverflow.com/questions/22063169/custom-aggregate-function-in-postgresql

Tags

sql

database

postgresql

aggregate-functions