问题
I have a table EVENTS
USER EVENT_TS EVENT_TYPE
abc 2016-01-01 08:00:00 Login
abc 2016-01-01 08:25:00 Stuff
abc 2016-01-01 10:00:00 Stuff
abc 2016-01-01 14:00:00 Login
xyz 2015-12-31 18:00:00 Login
xyz 2016-01-01 08:00:00 Logout
What I need to do is produce a session field for each period of activity for each user. In addition, if the user has been idle for a period equal to or longer than p_timeout (1 hour in this case) then a new session starts at the next activity. Users don't always log out cleanly, so the logout isn't walways there...
Notes:
Logout always terminates a session
There doesn't have to be a logout or a login (because software)
Login is always a new session
Output like
USER EVENT_TS EVENT_TYPE SESSION
abc 2016-01-01 08:00:00 Login 1
abc 2016-01-01 08:25:00 Stuff 1
abc 2016-01-01 10:00:00 Stuff 2
abc 2016-01-01 14:00:00 Login 3
xyz 2015-12-31 18:00:00 Login 1
xyz 2016-01-01 08:00:00 Logout 1
Any thoughts on how to acheive this?
回答1:
I think this may do what you need. I changed "user" to "usr" in the input, and "session" to "sess" in the output - I don't ever use reserved Oracle words for object names.
Note: as Boneist pointed out below, my solution will assign a session number of 0 to the first session, if it is a Logout event (or a succession of Logouts right at the top). If this situation can occur in the data, and if the desired behavior is to start session counts at 1 even in that case, then the definition of flag must be tweaked - for example, by making flag = 1 when lag(event_ts) over (partition by usr order by event_ts) is null as well.
Good luck!
with
events ( usr, event_ts, event_type ) as (
select 'abc', to_timestamp('2016-01-01 08:00:00', 'yyyy-mm-dd hh24:mi:ss'), 'Login' from dual union all
select 'abc', to_timestamp('2016-01-01 08:25:00', 'yyyy-mm-dd hh24:mi:ss'), 'Stuff' from dual union all
select 'abc', to_timestamp('2016-01-01 10:00:00', 'yyyy-mm-dd hh24:mi:ss'), 'Stuff' from dual union all
select 'abc', to_timestamp('2016-01-01 14:00:00', 'yyyy-mm-dd hh24:mi:ss'), 'Login' from dual union all
select 'xyz', to_timestamp('2015-12-31 18:00:00', 'yyyy-mm-dd hh24:mi:ss'), 'Login' from dual union all
select 'xyz', to_timestamp('2016-01-01 08:00:00', 'yyyy-mm-dd hh24:mi:ss'), 'Logout' from dual
),
start_of_sess ( usr, event_ts, event_type, flag ) as (
select usr, event_ts, event_type,
case when event_type != 'Logout'
and
( event_ts >= lag(event_ts) over (partition by usr
order by event_ts) + 1/24
or event_type = 'Login'
or lag(event_type) over (partition by usr
order by event_ts) = 'Logout'
)
then 1 end
from events
)
select usr, event_ts, event_type,
count(flag) over (partition by usr order by event_ts) as sess
from start_of_sess
;
Output (timestamps use my current NLS_TIMESTAMP_FORMAT setting):
USR EVENT_TS EVENT_TYPE SESS
--- --------------------------------- ---------- ------
abc 01-JAN-2016 08.00.00.000000000 AM Login 1
abc 01-JAN-2016 08.25.00.000000000 AM Stuff 1
abc 01-JAN-2016 10.00.00.000000000 AM Stuff 2
abc 01-JAN-2016 02.00.00.000000000 PM Login 3
xyz 31-DEC-2015 06.00.00.000000000 PM Login 1
xyz 01-JAN-2016 08.00.00.000000000 AM Logout 1
6 rows selected
回答2:
I think this will do the trick:
WITH EVENTS AS (SELECT 'abc' usr, to_date('2016-01-01 08:00:00', 'yyyy-mm-dd hh24:mi:ss') event_ts, 'login' event_type FROM dual UNION ALL
SELECT 'abc' usr, to_date('2016-01-01 08:25:00', 'yyyy-mm-dd hh24:mi:ss') event_ts, 'Stuff' event_type FROM dual UNION ALL
SELECT 'abc' usr, to_date('2016-01-01 10:00:00', 'yyyy-mm-dd hh24:mi:ss') event_ts, 'Stuff' event_type FROM dual UNION ALL
SELECT 'abc' usr, to_date('2016-01-01 14:00:00', 'yyyy-mm-dd hh24:mi:ss') event_ts, 'login' event_type FROM dual UNION ALL
SELECT 'xyz' usr, to_date('2015-12-31 18:00:00', 'yyyy-mm-dd hh24:mi:ss') event_ts, 'login' event_type FROM dual UNION ALL
SELECT 'xyz' usr, to_date('2016-01-01 08:00:00', 'yyyy-mm-dd hh24:mi:ss') event_ts, 'Logout' event_type FROM dual UNION ALL
SELECT 'def' usr, to_date('2016-01-01 08:00:00', 'yyyy-mm-dd hh24:mi:ss') event_ts, 'Logout' event_type FROM dual UNION ALL
SELECT 'def' usr, to_date('2016-01-01 08:15:00', 'yyyy-mm-dd hh24:mi:ss') event_ts, 'Logout' event_type FROM dual)
SELECT usr,
event_ts,
event_type,
SUM(counter) OVER (PARTITION BY usr ORDER BY event_ts) session_id
FROM (SELECT usr,
event_ts,
event_type,
CASE WHEN LAG(event_type, 1, 'Logout') OVER (PARTITION BY usr ORDER BY event_ts) = 'Logout' THEN 1
WHEN event_type = 'Logout' THEN 0
WHEN event_ts - LAG(event_ts) OVER (PARTITION BY usr ORDER BY event_ts) > 1/24 THEN 1
WHEN event_type = 'login' THEN 1
ELSE 0
END counter
FROM EVENTS);
USR EVENT_TS EVENT_TYPE SESSION_ID
--- ------------------- ---------- ----------
abc 2016-01-01 08:00:00 login 1
abc 2016-01-01 08:25:00 Stuff 1
abc 2016-01-01 10:00:00 Stuff 2
abc 2016-01-01 14:00:00 login 3
def 2016-01-01 08:00:00 Logout 1
def 2016-01-01 08:15:00 Logout 2
xyz 2015-12-31 18:00:00 login 1
xyz 2016-01-01 08:00:00 Logout 1
This solution relies on the logic-short circuiting that takes place in the CASE expression and the fact that the event_type is not null. It also assumes that multiple logouts in a row are counted as separate sessions:
- If the previous row was a logout row (and if there is no previous row - i.e. for the first row in the set - treat it as if a logout row was present), we want to increase the counter by one. (Logouts terminate the session, so we always have a new session following a logout.)
- If the current row is a logout, then this terminates the existing session. Therefore, the counter shouldn't be increased.
- If the time of the current row is greater than an hour from the previous row, increase the counter by one.
- If the current row is a login row, then it's a new session, so increase the counter by one.
- For any other case, we don't increase the counter.
Once we've done that, it's just a matter of doing a running total on the counter.
回答3:
For completeness (for users with Oracle 12 or above), here is a solution using MATCH_RECOGNIZE:
select usr, event_ts, event_type, sess
from events
match_recognize(
partition by usr
order by event_ts
measures match_number() as sess
all rows per match
pattern (strt follow*)
define follow as event_type = 'Logout'
or ( event_type != 'Login'
and prev(event_type) != 'Logout'
and event_ts < prev(event_ts) + 1/24
)
)
;
Here I cover an unusual case: a Logout event following another Logout event. In such cases, I assume all consecutive Logouts, no matter how many and how far apart in time, belong to the same session. (If such cases are guaranteed not to occur in the data, so much the better.)
Please see also the Note I added to my other answer (for Oracle 11 and below) regarding the possibility of the very first event for a usr being a Logout (if that is even possible in the input data).
来源:https://stackoverflow.com/questions/40133743/create-id-column-based-on-activity-data