What percent of the time does a user log in, immediately followed by sending a message?

Submitted by 微笑、不失礼 on 2020-04-16 05:48:25

Question


I have never queried for such a thing before and am not sure whether it is possible. Let's say I have the following table:

user_id   date              event
22      2012-05-02 11:02:39 login
22      2012-05-02 11:02:53 send_message
22      2012-05-02 11:03:28 logout
22      2012-05-02 11:04:09 login
22      2012-05-02 11:03:16 send_message
22      2012-05-02 11:03:43 search_run

How can I calculate the percentage of logins that are followed by a sent message within 2 minutes?


Answer 1:


For a given user:

SELECT round(count(*) FILTER (WHERE sent_in_time) * 100.0 / count(*), 2) AS pct_sent_in_time
FROM  (
   SELECT (min(date) FILTER (WHERE event = 'send_message')
         - min(date)) < interval '2 min' AS sent_in_time
   FROM  (
      SELECT date, event
           , count(*) FILTER (WHERE event = 'login')
                      OVER (ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS grp
      FROM   tbl
      WHERE  user_id = 22  -- given user
      ) sub1
   GROUP  BY grp
   ) sub2;
| pct_sent_in_time |
| ---------------: |
|            50.00 |

For all users:

SELECT user_id
     , round(count(*) FILTER (WHERE sent_in_time) * 100.0 / count(*), 2) AS pct_sent_in_time
FROM  (
   SELECT user_id
        , (min(date) FILTER (WHERE event = 'send_message')
         - min(date)) < interval '2 min' AS sent_in_time
   FROM  (
      SELECT user_id, date, event
           , count(*) FILTER (WHERE event = 'login')
                      OVER (PARTITION BY user_id ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS grp
      FROM   tbl
      ) sub1
   GROUP  BY user_id, grp
   ) sub2
GROUP  BY user_id;
| user_id | pct_sent_in_time |
| ------: | ---------------: |
|      22 |            33.33 |
|      23 |           100.00 |

I extended the test case to make it more revealing, hence a different percentage. See:
db<>fiddle here

Partition data after every new login, and check whether 'send_message' happens within less than 2 minutes. Then calculate percentage and round.

Notably, this is not fooled by many logins in quick succession, followed by a login with a message in under 2 minutes.
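
To make the grouping step concrete, here is a rough Python sketch of the same logic for a single user, run outside the database. The function name `pct_sent_in_time` and the in-memory list of `(timestamp, event)` tuples are assumptions for illustration only; it is meant to mirror the single-user query above, not replace it.

```python
from datetime import datetime, timedelta


def pct_sent_in_time(events, window=timedelta(minutes=2)):
    """Percent of login groups with a 'send_message' inside `window`.

    `events` is an iterable of (timestamp, event_name) tuples for ONE user
    (hypothetical input shape, standing in for rows of tbl).
    """
    # Running count of logins -> [first timestamp in group, first send_message].
    # This plays the role of the windowed count(*) FILTER (...) AS grp.
    groups = {}
    login_count = 0
    for ts, event in sorted(events, key=lambda e: e[0]):
        if event == "login":
            login_count += 1  # every login opens a new group
        group = groups.setdefault(login_count, [ts, None])
        if event == "send_message" and group[1] is None:
            group[1] = ts  # earliest message in the group, like min(date) FILTER

    if not groups:
        return None  # no rows at all, nothing to divide by

    # A group without any message counts as "not sent in time",
    # just as the NULL comparison is skipped by FILTER in the SQL.
    hits = sum(
        1
        for first, sent in groups.values()
        if sent is not None and sent - first < window
    )
    return round(hits * 100.0 / len(groups), 2)


# Rows for user 22 from the question.
rows = [
    ("2012-05-02 11:02:39", "login"),
    ("2012-05-02 11:02:53", "send_message"),
    ("2012-05-02 11:03:28", "logout"),
    ("2012-05-02 11:04:09", "login"),
    ("2012-05-02 11:03:16", "send_message"),
    ("2012-05-02 11:03:43", "search_run"),
]
events = [(datetime.fromisoformat(ts), ev) for ts, ev in rows]
print(pct_sent_in_time(events))  # 50.0
```

Since each login opens its own group, a burst of logins without a message lowers the percentage instead of being merged into the one login that did lead to a message.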

Related:

  • Aggregate values over a range of hours, every hour

Aside: The name "date" for a timestamp column is quite misleading.



Source: https://stackoverflow.com/questions/61032939/what-percent-of-the-time-does-a-user-login-immediately-followed-by-sending-a-me
