I'm trying to learn SQL so be patient with me. I'm using PostgreSQL 9.3
I want to average a column based on a window of dates. I've been able to write window functions that accomplish this with a set interval
but I'd like to be able to be able to do this with a growing interval
. By this I mean:
average values from date_0 to date_1
average values from date_0 to date_2
average values from date_0 to date_3
..... so date date_0 stays the same and date_x grows and creates a larger sample
I'm assuming there is a better way than running a query for each range I'd like to average. Any advice is appreciated. Thank you.
EDIT
I'm trying to create evenly spaced bins to be used to aggregate values of a table over.
I come to the interval by:
(MAX(date) - MIN(date)) / bins
where date
is the column of a table
and bins
is the number of bins I'd like to divide the table into.
date_0
= MIN(date)date_n
= MIN(date) + (interval * n)
I suggest the handy function width_bucket()
:
To get the average for each time segment ("bin"):
SELECT width_bucket(extract(epoch FROM t.the_date)
, x.min_epoch, x.max_epoch, x.bins) AS bin
, avg(value) AS bin_avg
FROM tbl t
, (SELECT extract(epoch FROM min(the_date)) AS min_epoch
, extract(epoch FROM max(the_date)) AS max_epoch
, 10 AS bins
FROM tbl t
) x
GROUP BY 1;
To get the "running average" over the (step-by-step) growing time interval:
SELECT bin, round(sum(bin_sum) OVER w /sum(bin_ct) OVER w, 2) AS running_avg
FROM (
SELECT width_bucket(extract(epoch FROM t.the_date)
, x.min_epoch, x.max_epoch, x.bins) AS bin
, sum(value) AS bin_sum
, count(*) AS bin_ct
FROM tbl t
, (SELECT extract(epoch FROM min(the_date)) AS min_epoch
, extract(epoch FROM max(the_date)) AS max_epoch
, 10 AS bins
FROM tbl t
) x
GROUP BY 1
) sub
WINDOW w AS (ORDER BY bin)
ORDER BY 1;
Using the_date
instead of date
as column name, avoiding reserved words as identifiers.
Since width_bucket()
is currently only implemented for double precision
and numeric
, I extract epoch values from the_date
. Details here:
Aggregating (x,y) coordinate point clouds in PostgreSQL
If you have a set of data, you can readily get what you want in separate columns:
select avg(case when date between date_0 and date_1 then value end) as avg1,
avg(case when date between date_0 and date_2 then value end) as avg2,
. . .
avg(case when date between date_0 and date_n then value end) as avgn
from table t
where date >= date_0;
来源:https://stackoverflow.com/questions/24513618/multiple-averages-over-evenly-spaced-intervals