SQL to determine minimum sequential days of access?

前端 未结 19 1702
我在风中等你
我在风中等你 2020-12-04 04:58

The following User History table contains one record for every day a given user has accessed a website (in a 24 hour UTC period). It has many thousands of r

19条回答
  •  失恋的感觉
    2020-12-04 05:29

    How about (and please make sure the previous statement ended with a semi-colon):

    WITH numberedrows
         AS (SELECT ROW_NUMBER() OVER (PARTITION BY UserID 
                                           ORDER BY CreationDate)
                    - DATEDIFF(day,'19000101',CreationDate) AS TheOffset,
                    CreationDate,
                    UserID
             FROM   tablename)
    SELECT MIN(CreationDate),
           MAX(CreationDate),
           COUNT(*) AS NumConsecutiveDays,
           UserID
    FROM   numberedrows
    GROUP  BY UserID,
              TheOffset  
    

    The idea being that if we have list of the days (as a number), and a row_number, then missed days make the offset between these two lists slightly bigger. So we're looking for a range that has a consistent offset.

    You could use "ORDER BY NumConsecutiveDays DESC" at the end of this, or say "HAVING count(*) > 14" for a threshold...

    I haven't tested this though - just writing it off the top of my head. Hopefully works in SQL2005 and on.

    ...and would be very much helped by an index on tablename(UserID, CreationDate)

    Edited: Turns out Offset is a reserved word, so I used TheOffset instead.

    Edited: The suggestion to use COUNT(*) is very valid - I should've done that in the first place but wasn't really thinking. Previously it was using datediff(day, min(CreationDate), max(CreationDate)) instead.

    Rob

提交回复
热议问题