MySQL: group by consecutive days and count groups

前端 未结 5 1736
广开言路
广开言路 2020-12-17 20:43

I have a database table which holds each user\'s checkins in cities. I need to know how many days a user has been in a city, and then, how many visits a user has made to a c

5条回答
  •  时光取名叫无心
    2020-12-17 21:29

    Try to apply this code to your task -

    CREATE TABLE visits(
      user_id INT(11) NOT NULL,
      dt DATETIME DEFAULT NULL
    );
    
    INSERT INTO visits VALUES 
      (1, '2011-06-30 12:11:46'),
      (1, '2011-07-01 13:16:34'),
      (1, '2011-07-01 15:22:45'),
      (1, '2011-07-01 22:35:00'),
      (1, '2011-07-02 13:45:12'),
      (1, '2011-08-01 00:11:45'),
      (1, '2011-08-05 17:14:34'),
      (1, '2011-08-05 18:11:46'),
      (1, '2011-08-06 20:22:12'),
      (2, '2011-08-30 16:13:34'),
      (2, '2011-08-31 16:13:41');
    
    
    SET @i = 0;
    SET @last_dt = NULL;
    SET @last_user = NULL;
    
    SELECT v.user_id,
      COUNT(DISTINCT(DATE(dt))) number_of_days,
      MAX(days) number_of_visits
    FROM
      (SELECT user_id, dt
            @i := IF(@last_user IS NULL OR @last_user <> user_id, 1, IF(@last_dt IS NULL OR (DATE(dt) - INTERVAL 1 DAY) > DATE(@last_dt), @i + 1, @i)) AS days,
            @last_dt := DATE(dt),
            @last_user := user_id
       FROM
         visits
       ORDER BY
         user_id, dt
      ) v
    GROUP BY
      v.user_id;
    
    ----------------
    Output:
    
    +---------+----------------+------------------+
    | user_id | number_of_days | number_of_visits |
    +---------+----------------+------------------+
    |       1 |              6 |                3 |
    |       2 |              2 |                1 |
    +---------+----------------+------------------+
    

    Explanation:

    To understand how it works let's check the subquery, here it is.

    SET @i = 0;
    SET @last_dt = NULL;
    SET @last_user = NULL;
    
    
    SELECT user_id, dt,
            @i := IF(@last_user IS NULL OR @last_user <> user_id, 1, IF(@last_dt IS NULL OR (DATE(dt) - INTERVAL 1 DAY) > DATE(@last_dt), @i + 1, @i)) AS 
    
    days,
            @last_dt := DATE(dt) lt,
            @last_user := user_id lu
    FROM
      visits
    ORDER BY
      user_id, dt;
    

    As you see the query returns all rows and performs ranking for the number of visits. This is known ranking method based on variables, note that rows are ordered by user and date fields. This query calculates user visits, and outputs next data set where days column provides rank for the number of visits -

    +---------+---------------------+------+------------+----+
    | user_id | dt                  | days | lt         | lu |
    +---------+---------------------+------+------------+----+
    |       1 | 2011-06-30 12:11:46 |    1 | 2011-06-30 |  1 |
    |       1 | 2011-07-01 13:16:34 |    1 | 2011-07-01 |  1 |
    |       1 | 2011-07-01 15:22:45 |    1 | 2011-07-01 |  1 |
    |       1 | 2011-07-01 22:35:00 |    1 | 2011-07-01 |  1 |
    |       1 | 2011-07-02 13:45:12 |    1 | 2011-07-02 |  1 |
    |       1 | 2011-08-01 00:11:45 |    2 | 2011-08-01 |  1 |
    |       1 | 2011-08-05 17:14:34 |    3 | 2011-08-05 |  1 |
    |       1 | 2011-08-05 18:11:46 |    3 | 2011-08-05 |  1 |
    |       1 | 2011-08-06 20:22:12 |    3 | 2011-08-06 |  1 |
    |       2 | 2011-08-30 16:13:34 |    1 | 2011-08-30 |  2 |
    |       2 | 2011-08-31 16:13:41 |    1 | 2011-08-31 |  2 |
    +---------+---------------------+------+------------+----+
    

    Then we group this data set by user and use aggregate functions: 'COUNT(DISTINCT(DATE(dt)))' - counts the number of days 'MAX(days)' - the number of visits, it is a maximum value for the days field from our subquery.

    That is all;)

提交回复
热议问题