Cohort analysis in SQL

慢半拍i 2020-12-09 13:12

Looking to do some cohort analysis on a userbase. We have two tables, "users" and "sessions", where users and sessions both have a "created_at" field. I'm looking to f

4 answers
  •  孤街浪徒
    2020-12-09 13:59

    Example of a month-wise cohort:

    First, let's build a table of each user's activity by month:

    -- MySQL syntax. `order` is a reserved word and must be quoted with backticks;
    -- `user` is quoted for consistency.
    -- m1..m12 are calendar months (1 = January), not months since signup.
    -- Each flag is 1 if the user placed at least one order in that month, else 0.
    SELECT
        mu.created_timestamp AS cohort
        , mu.id AS user_id
        ,(SELECT IF(COUNT(l.order_date) = 0 , 0, 1) FROM `order` l WHERE MONTH(l.order_date) = 1 AND l.user_id = mu.id) AS m1
        ,(SELECT IF(COUNT(l.order_date) = 0 , 0, 1) FROM `order` l WHERE MONTH(l.order_date) = 2 AND l.user_id = mu.id) AS m2
        ,(SELECT IF(COUNT(l.order_date) = 0 , 0, 1) FROM `order` l WHERE MONTH(l.order_date) = 3 AND l.user_id = mu.id) AS m3
        ,(SELECT IF(COUNT(l.order_date) = 0 , 0, 1) FROM `order` l WHERE MONTH(l.order_date) = 4 AND l.user_id = mu.id) AS m4
        ,(SELECT IF(COUNT(l.order_date) = 0 , 0, 1) FROM `order` l WHERE MONTH(l.order_date) = 5 AND l.user_id = mu.id) AS m5
        ,(SELECT IF(COUNT(l.order_date) = 0 , 0, 1) FROM `order` l WHERE MONTH(l.order_date) = 6 AND l.user_id = mu.id) AS m6
        ,(SELECT IF(COUNT(l.order_date) = 0 , 0, 1) FROM `order` l WHERE MONTH(l.order_date) = 7 AND l.user_id = mu.id) AS m7
        ,(SELECT IF(COUNT(l.order_date) = 0 , 0, 1) FROM `order` l WHERE MONTH(l.order_date) = 8 AND l.user_id = mu.id) AS m8
        ,(SELECT IF(COUNT(l.order_date) = 0 , 0, 1) FROM `order` l WHERE MONTH(l.order_date) = 9 AND l.user_id = mu.id) AS m9
        ,(SELECT IF(COUNT(l.order_date) = 0 , 0, 1) FROM `order` l WHERE MONTH(l.order_date) = 10 AND l.user_id = mu.id) AS m10
        ,(SELECT IF(COUNT(l.order_date) = 0 , 0, 1) FROM `order` l WHERE MONTH(l.order_date) = 11 AND l.user_id = mu.id) AS m11
        ,(SELECT IF(COUNT(l.order_date) = 0 , 0, 1) FROM `order` l WHERE MONTH(l.order_date) = 12 AND l.user_id = mu.id) AS m12
    FROM `user` mu
    WHERE mu.created_timestamp BETWEEN '2018-01-01 00:00:00' AND '2019-12-31 23:59:59'
    

    Then, using that table as a subquery, sum the per-user activity flags for each signup month:

    -- Outer query: one row per signup month, with the number of signups and
    -- the number of those users who ordered in each calendar month.
    -- Note: MONTH() ignores the year, so signups from January 2018 and
    -- January 2019 fall into the same cohort row.
    SELECT MONTH(c.cohort) AS cohort
           ,COUNT(c.user_id) AS signups
           ,SUM(c.m1) AS m1
           ,SUM(c.m2) AS m2
           ,SUM(c.m3) AS m3
           ,SUM(c.m4) AS m4
           ,SUM(c.m5) AS m5
           ,SUM(c.m6) AS m6
           ,SUM(c.m7) AS m7
           ,SUM(c.m8) AS m8
           ,SUM(c.m9) AS m9
           ,SUM(c.m10) AS m10
           ,SUM(c.m11) AS m11
           ,SUM(c.m12) AS m12
    -- Inner query: per-user activity flags. `order` is a reserved word in MySQL,
    -- so the table names are quoted with backticks.
    FROM (SELECT
        mu.created_timestamp AS cohort
        , mu.id AS user_id
        ,(SELECT IF(COUNT(l.order_date) = 0 , 0, 1) FROM `order` l WHERE MONTH(l.order_date) = 1 AND l.user_id = mu.id) AS m1
        ,(SELECT IF(COUNT(l.order_date) = 0 , 0, 1) FROM `order` l WHERE MONTH(l.order_date) = 2 AND l.user_id = mu.id) AS m2
        ,(SELECT IF(COUNT(l.order_date) = 0 , 0, 1) FROM `order` l WHERE MONTH(l.order_date) = 3 AND l.user_id = mu.id) AS m3
        ,(SELECT IF(COUNT(l.order_date) = 0 , 0, 1) FROM `order` l WHERE MONTH(l.order_date) = 4 AND l.user_id = mu.id) AS m4
        ,(SELECT IF(COUNT(l.order_date) = 0 , 0, 1) FROM `order` l WHERE MONTH(l.order_date) = 5 AND l.user_id = mu.id) AS m5
        ,(SELECT IF(COUNT(l.order_date) = 0 , 0, 1) FROM `order` l WHERE MONTH(l.order_date) = 6 AND l.user_id = mu.id) AS m6
        ,(SELECT IF(COUNT(l.order_date) = 0 , 0, 1) FROM `order` l WHERE MONTH(l.order_date) = 7 AND l.user_id = mu.id) AS m7
        ,(SELECT IF(COUNT(l.order_date) = 0 , 0, 1) FROM `order` l WHERE MONTH(l.order_date) = 8 AND l.user_id = mu.id) AS m8
        ,(SELECT IF(COUNT(l.order_date) = 0 , 0, 1) FROM `order` l WHERE MONTH(l.order_date) = 9 AND l.user_id = mu.id) AS m9
        ,(SELECT IF(COUNT(l.order_date) = 0 , 0, 1) FROM `order` l WHERE MONTH(l.order_date) = 10 AND l.user_id = mu.id) AS m10
        ,(SELECT IF(COUNT(l.order_date) = 0 , 0, 1) FROM `order` l WHERE MONTH(l.order_date) = 11 AND l.user_id = mu.id) AS m11
        ,(SELECT IF(COUNT(l.order_date) = 0 , 0, 1) FROM `order` l WHERE MONTH(l.order_date) = 12 AND l.user_id = mu.id) AS m12
    FROM `user` mu
    WHERE mu.created_timestamp BETWEEN '2018-01-01 00:00:00' AND '2019-12-31 23:59:59') AS c
    GROUP BY MONTH(c.cohort)
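
    The m1..m12 columns in the queries here are calendar months, and grouping by MONTH() merges the same month of 2018 and 2019. If you would rather key cohorts by year and month and measure activity by time elapsed since signup, a minimal sketch could look like the following. It assumes MySQL and the same table and column names used in this answer (user.id, user.created_timestamp, order.user_id, order.order_date); the m0..m3 column names are only illustrative.

    ```sql
    -- Sketch only: assumes MySQL and the `user` / `order` tables from this answer.
    -- TIMESTAMPDIFF(MONTH, a, b) returns the number of whole months elapsed from a to b,
    -- so m0 counts users who ordered less than one full month after signing up,
    -- m1 counts users who ordered between one and two months after signup, and so on.
    SELECT
        DATE_FORMAT(mu.created_timestamp, '%Y-%m') AS cohort
        , COUNT(DISTINCT mu.id) AS signups
        , COUNT(DISTINCT CASE WHEN TIMESTAMPDIFF(MONTH, mu.created_timestamp, l.order_date) = 0 THEN mu.id END) AS m0
        , COUNT(DISTINCT CASE WHEN TIMESTAMPDIFF(MONTH, mu.created_timestamp, l.order_date) = 1 THEN mu.id END) AS m1
        , COUNT(DISTINCT CASE WHEN TIMESTAMPDIFF(MONTH, mu.created_timestamp, l.order_date) = 2 THEN mu.id END) AS m2
        , COUNT(DISTINCT CASE WHEN TIMESTAMPDIFF(MONTH, mu.created_timestamp, l.order_date) = 3 THEN mu.id END) AS m3
    FROM `user` mu
    -- LEFT JOIN keeps users with no orders so they still count as signups.
    LEFT JOIN `order` l
           ON l.user_id = mu.id
          AND l.order_date >= mu.created_timestamp
    WHERE mu.created_timestamp >= '2018-01-01'
      AND mu.created_timestamp <  '2020-01-01'
    GROUP BY DATE_FORMAT(mu.created_timestamp, '%Y-%m')
    ORDER BY cohort;
    ```

    COUNT(DISTINCT ...) is used because the join produces one row per order, so a plain COUNT would count a user once per order. Extend the CASE pattern to as many month offsets as you need.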
    

    You can use days instead of months, although cohort analysis is most often done by month.
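
    For a day-based version, one possible shape is a long result with one row per signup day and day offset, rather than one column per period. This is a sketch under assumptions: MySQL, the same `user` and `order` tables and columns as in this answer, and an arbitrary one-month signup window and 30-day horizon chosen only for illustration.

    ```sql
    -- Sketch only: assumes MySQL and the `user` / `order` tables from this answer.
    -- DATEDIFF(a, b) returns a minus b in days, using only the date parts,
    -- so day_offset = 0 means the order was placed on the signup date.
    SELECT
        DATE(mu.created_timestamp) AS cohort_day
        , DATEDIFF(l.order_date, mu.created_timestamp) AS day_offset
        , COUNT(DISTINCT mu.id) AS active_users
    FROM `user` mu
    JOIN `order` l
      ON l.user_id = mu.id
    WHERE mu.created_timestamp >= '2018-01-01'
      AND mu.created_timestamp <  '2018-02-01'
      AND l.order_date >= mu.created_timestamp
      AND DATEDIFF(l.order_date, mu.created_timestamp) <= 30
    GROUP BY DATE(mu.created_timestamp), DATEDIFF(l.order_date, mu.created_timestamp)
    ORDER BY cohort_day, day_offset;
    ```

    Only day offsets on which at least one user ordered appear in the result; the pivot into columns can then be done in the reporting layer.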
