问题
I have table event:
event_date,
num_events,
site_id
I can easily use aggregate SQL to do SELECT SUM(num_events) GROUP BY site_id.
But I also have another table site:
site_id,
target_date
I'd like to do a JOIN, showing the SUM of num_events within 60 days of the target_date, 90 days, 120 days, etc. I thought this could easily be done using a WHERE clause in the aggregate SQL. However, this is complicated by two challenges:
- The
target_dateis not fixed, but varies for eachsite_id - I'd like multiple date ranges to be output in the same table; so I can't do a simple
WHEREto exclude records falling outside the range from theeventtable.
One workaround I've thought of is to simply make several queries, one for each date range, and then use a view to paste them together. Is there a simpler, better, or more elegant way to achieve my goals?
回答1:
You would do something like:
select sum(case when target_date - event_date < 30 then 1 else 0 end) as within_030,
sum(case when target_date - event_date < 60 then 1 else 0 end) as within_060,
sum(case when target_date - event_date < 90 then 1 else 0 end) as within_090
from event e join
site s
on e.site_id = s.site_id;
That is, you can use conditional aggregation. I am not sure what "within 60" days means. This gives days before the target date, but similar logic will work for what you need.
回答2:
In Postgres 9.4 use the new aggregate FILTER clause:
Assuming actual date data type, so we can simply add / subtracts integer numbers for days.
Interpreting "within n days" as "+/- n days":
SELECT site_id, s.target_date
, sum(e.num_events) FILTER (WHERE e.event_date BETWEEN s.target_date - 30
AND s.target_date + 30) AS sum_30
, sum(e.num_events) FILTER (WHERE e.event_date BETWEEN s.target_date - 60
AND s.target_date + 60) AS sum_60
, sum(e.num_events) FILTER (WHERE e.event_date BETWEEN s.target_date - 90
AND s.target_date + 90) AS sum_90
FROM site s
JOIN event e USING (site_id)
WHERE e.event_date BETWEEN s.target_date - 90
AND s.target_date + 90
GROUP BY 1, 2;
Also add the condition as WHERE clause to exclude irrelevant rows early. This should be substantially faster with more than a trivial number of rows outside the scope of sum_90 in event.
来源:https://stackoverflow.com/questions/29062193/calculate-the-sum-of-a-field-filtered-by-a-window-defined-on-another-field