Group and count events per time intervals, plus running total

家住魔仙堡 提交于 2019-12-11 02:23:39

问题


I'm a fairly new Postgres user, I'm sure there's an answer to this already but I can't find it.
I need to analyze some data in an activity log table, grouping the the results by time period.

A simple version of the problem would be a table with three fields:

    Column    |           Type           |              Modifiers
--------------+--------------------------+-------------------------------------
 period_start | timestamp with time zone | not null
 user_id      | text                     | not null
 action       | text                     | not null

The action string I want to capture could be 'create_entry' (yes, I know that sucks as good db design but I'm stuck with it)

The output I'm looking for is a report showing count of 'create_entry' actions by year and month. Something like:

 Year | Month | Entries
------+-------+----------
 2013 |  12   | 14345
 2014 |   1   | 9876
 2014 |   2   | 10234

My instinct is to attack this with sub-queries but looking around there seems to be a suggestion that loops might be the way to go.

Either way, I'm out of my depth and looking for a push in the right direction.

EDIT

A supplementary question based on a request that came in while I was typing this.. A variation on the above that would show a cumulative total month by month (although I'd like to think I could figure that out by myself based on the approach to the above - I'm just adding this in case it's relevant to the approach).


回答1:


If I understand correctly, you just want to GROUP BY the years and months in your table, for every row that has an action of create_entry:

SELECT
  DATE_PART('YEAR', period_start) as Year,
  DATE_PART('MONTH', period_start) as Month,
  COUNT(*) as Entries
FROM activity_log
WHERE action = 'create_entry'
GROUP BY Year, Month;

SQL Fiddle




回答2:


I would use date_trunc() to truncate the timestamp to the month and cover both year and month in a single column. You can use to_char() to format that any way you like.

To get the running count you mention in your supplementary question, add a window function:

SELECT to_char(date_trunc('month', period_start), 'Mon YYYY') AS month
     , count(*) AS month_ct
     , sum(count(*)) OVER (ORDER BY date_trunc('month', period_start))
                                                              AS running_ct
FROM   activity_log
WHERE  action = 'create_entry'
GROUP  BY date_trunc('month', period_start);

-> SQLfiddle

  • Window functions are execute after aggregate function, so we can run a window function over an aggregate function in the same query level. Another example for this:
    Postgres window function and group by exception

  • It's essential that the window function and the GROUP BY clause repeat the basic expression we are grouping by: date_trunc('month', period_start).



来源:https://stackoverflow.com/questions/22055461/group-and-count-events-per-time-intervals-plus-running-total

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!