Question
We have a table with the following columns:
SESSION_ID  USER_ID  CONNECT_TS
----------  -------  ------------------
1           99       2013-01-01 2:23:33
2           101      2013-01-01 2:23:55
3           104      2013-01-01 2:24:41
4           101      2013-01-01 2:24:43
5           233      2013-01-01 2:25:01
We need to get a distinct count of users for each day and a count of "active users" which are defined as users that have used the application in the last 45 days. Here is what we have come up with, but I feel like there has to be a better way:
select trunc(a.connect_ts)
, count(distinct a.user_id) daily_users
, count(distinct b.user_id) active_users
from sessions a
join sessions b
on (b.connect_ts between trunc(a.connect_ts) - 45 and trunc(a.connect_ts))
where a.connect_ts between date '2013-01-01' and date '2013-06-12'
and b.connect_ts between date '2012-11-01' and date '2013-06-12'
group by trunc(a.connect_ts);
We looked at window functions, but it doesn't look like distinct counts are supported. We also considered loading aggregates into a temp table first but, again, the distinct counts ruled it out. Is there a better way to be doing this?
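To pin down what the two numbers are supposed to be, here is a minimal Python sketch of the requirement. It is only a reference for checking results, not a replacement for the SQL. The function name `rolling_distinct`, the use of plain dates instead of timestamps, and the extra row for 2013-01-02 are assumptions made for illustration; the other rows mirror the sample table.

```python
from collections import defaultdict
from datetime import date, timedelta


def rolling_distinct(sessions, window_days=45):
    """sessions: iterable of (user_id, connect_date) pairs.

    Returns {day: (daily_users, active_users)} for every day with a session.
    """
    # Collect the set of distinct users seen on each calendar day.
    users_by_day = defaultdict(set)
    for user_id, day in sessions:
        users_by_day[day].add(user_id)

    result = {}
    for day in sorted(users_by_day):
        active = set()
        # Union the per-day sets over the inclusive window [day - 45, day].
        for offset in range(window_days + 1):
            active |= users_by_day.get(day - timedelta(days=offset), set())
        result[day] = (len(users_by_day[day]), len(active))
    return result


sample = [
    (99, date(2013, 1, 1)),
    (101, date(2013, 1, 1)),
    (104, date(2013, 1, 1)),
    (101, date(2013, 1, 1)),
    (233, date(2013, 1, 1)),
    (99, date(2013, 1, 2)),  # hypothetical extra row, not in the sample table
]
counts = rolling_distinct(sample)
```

With this data, 2013-01-01 has 4 daily and 4 active users, and 2013-01-02 has 1 daily user but still 4 active users, because the window reaches back over the previous day.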
Answer 1:
The first thing to do is generate a list of the days you're interested in:
select (trunc(sysdate, 'yyyy') -1) + level as ts_day
from dual
connect by level <= to_number( to_char(sysdate, 'DDD' ) )
This will generate a table of dates from 01-JAN this year to today. Join your table to this sub-query. Using a cross join might not be particularly efficient, depending on how much data you have in the range. So please regard this as a proof of concept and tune as you need.
with days as
( select (trunc(sysdate, 'yyyy') -1) + level as ts_day
from dual
connect by level <= to_number( to_char(sysdate, 'DDD' ) ) )
select days.ts_day
, count ( distinct case when trunc(connect_ts) = ts_day then user_id end ) as daily_users
, count ( distinct case when trunc(connect_ts) between ts_day - 45 and ts_day then user_id end ) as active_users
from days
cross join sessions
where connect_ts between trunc(sysdate, 'yyyy') - 45 and sysdate
group by ts_day
order by ts_day
/
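The calendar-join idea can be tried out on a small data set without an Oracle instance. The sketch below runs it against an in-memory SQLite database through Python's `sqlite3` module, so the syntax differs from Oracle: a recursive CTE stands in for `connect by level`, and `date()` stands in for `trunc()`. It also joins each day only to the sessions inside its 45-day window instead of cross joining everything, and counts distinct `user_id` values rather than rows. The fixed date range and the sixth session row are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table sessions (session_id integer, user_id integer, connect_ts text)"
)
conn.executemany(
    "insert into sessions values (?, ?, ?)",
    [
        (1, 99, "2013-01-01 02:23:33"),
        (2, 101, "2013-01-01 02:23:55"),
        (3, 104, "2013-01-01 02:24:41"),
        (4, 101, "2013-01-01 02:24:43"),
        (5, 233, "2013-01-01 02:25:01"),
        (6, 99, "2013-01-03 09:00:00"),  # hypothetical extra row
    ],
)

query = """
with recursive days(ts_day) as (
    -- one row per calendar day in the (assumed) reporting range
    select date('2013-01-01')
    union all
    select date(ts_day, '+1 day') from days where ts_day < '2013-01-03'
)
select days.ts_day
     , count(distinct case when date(s.connect_ts) = days.ts_day
                           then s.user_id end) as daily_users
     , count(distinct s.user_id) as active_users
  from days
  left join sessions s
    on date(s.connect_ts) between date(days.ts_day, '-45 days') and days.ts_day
 group by days.ts_day
 order by days.ts_day
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Because the calendar drives the query, a day with no sessions (2013-01-02 here) still gets a row, with 0 daily users and the active count carried by the window.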
Answer 2:
If your version of Oracle supports the WITH clause, this might help you:
with sel as (
select trunc(connect_ts) as logon_day
, count(distinct user_id) as logon_count
from sessions
group by trunc(connect_ts)
)
select s1.logon_day
, s1.logon_count as daily_users
, (select sum(logon_count) from sel where logon_day between s1.logon_day - 45 and s1.logon_day) as active_users
from sel s1
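Otherwise you'll have to write it this way, which executes much slower. Note that the two queries are not equivalent: the one above sums daily counts, so a user who connected on several days in the window is counted once per day, whereas this one counts each user once per window: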
select sel.logon_day
, sel.logon_count as daily_users
, (select count(distinct user_id) as logon_count
from sessions
where trunc(connect_ts) between sel.logon_day - 45 and sel.logon_day) as active_users
from (select trunc(connect_ts) as logon_day, count(distinct user_id) as logon_count
from sessions
group by trunc(connect_ts)) sel
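The two queries in this answer can return different numbers for active_users: the first adds up per-day distinct counts, while the second counts distinct users across the whole window. A small Python sketch of the difference, using made-up data in which user 101 connects on two days:

```python
from datetime import date

# Hypothetical data: user 101 connects on two different days.
users_by_day = {
    date(2013, 1, 1): {99, 101},
    date(2013, 1, 2): {101},
}

# Summing per-day distinct counts counts user 101 once for each day.
summed = sum(len(users) for users in users_by_day.values())

# Taking the union of the per-day sets first counts each user once.
exact = len(set().union(*users_by_day.values()))

print(summed, exact)
```

Here the summed figure is 3 while only 2 distinct users were active, so the sum is only an upper bound on the true active-user count.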
Source: https://stackoverflow.com/questions/17099743/rolling-daily-distinct-counts