SQL to determine minimum sequential days of access?


The following User History table contains one record for every day a given user has accessed a website (in a 24 hour UTC period). It has many thousands of r

19 answers

一整个雨季

2020-12-04 05:22

declare @startdate as datetime, @days as int
set @startdate = cast('11 Jan 2009' as datetime) -- The startdate
set @days = 5 -- The number of consecutive days

SELECT userid
      ,count(1) as [Number of Consecutive Days]
FROM UserHistory
WHERE creationdate >= @startdate
AND creationdate < dateadd(dd, @days, cast(convert(char(11), @startdate, 113)  as datetime))
GROUP BY userid
HAVING count(1) >= @days

The expression cast(convert(char(11), @startdate, 113) as datetime) removes the time part of the date so we start at midnight.

I would assume also that the creationdate and userid columns are indexed.

I just realized that this won't list every user along with their total number of consecutive days. It will, however, tell you which users visited on each of a set number of days, starting from a date of your choosing.
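
To make the logic of the query above easier to follow, here is a minimal Python sketch of the same fixed-window count. It assumes the table has been loaded as (userid, creationdate) pairs with at most one row per user per day; the function name `users_active_in_window` and the in-memory row format are hypothetical, not part of the original answer.

```python
from collections import Counter
from datetime import datetime, timedelta
from typing import Iterable, List, Tuple


def users_active_in_window(
    rows: Iterable[Tuple[int, datetime]], start: datetime, days: int
) -> List[int]:
    """Return users with at least `days` rows in the window beginning at `start`."""
    # Mirror the cast/convert trick: drop the time part so the window
    # ends exactly `days` midnights after the start date.
    midnight = datetime(start.year, start.month, start.day)
    window_end = midnight + timedelta(days=days)

    # Same predicate as the WHERE clause: start <= creationdate < window_end.
    counts = Counter(user for user, ts in rows if start <= ts < window_end)

    # Same filter as HAVING count(1) >= @days.
    return sorted(user for user, n in counts.items() if n >= days)
```

With one row per user per day, a count equal to the number of days in the window can only happen if every day in the window has a row, which is why a plain count is enough here.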

Revised solution:

declare @days as int
set @days = 30
select t1.userid
from UserHistory t1
where (select count(1) 
       from UserHistory t3 
       where t3.userid = t1.userid
       and t3.creationdate >= DATEADD(dd, DATEDIFF(dd, 0, t1.creationdate), 0) 
       and t3.creationdate < DATEADD(dd, DATEDIFF(dd, 0, t1.creationdate) + @days, 0) 
       group by t3.userid
) >= @days
group by t1.userid

I've checked this and it will query for all users and all dates. It is based on Spencer's 1st (joke?) solution, but mine works.

Update: improved the date handling in the second solution.
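
The revised query can be read as: for every row, count that user's rows in the window that starts at midnight of that row's day and spans @days days. A rough Python sketch of that idea follows, again assuming at most one row per user per day; `users_with_full_window` and the tuple-based rows are illustrative names only.

```python
from collections import defaultdict
from datetime import datetime, time, timedelta
from typing import Dict, Iterable, List, Tuple


def users_with_full_window(
    rows: Iterable[Tuple[int, datetime]], days: int
) -> List[int]:
    """Return users who have some row that starts a window holding `days` rows."""
    stamps_by_user: Dict[int, List[datetime]] = defaultdict(list)
    for user, ts in rows:
        stamps_by_user[user].append(ts)

    matches = []
    for user, stamps in stamps_by_user.items():
        for ts in stamps:
            # DATEADD(dd, DATEDIFF(dd, 0, creationdate), 0): truncate to midnight.
            window_start = datetime.combine(ts.date(), time.min)
            window_end = window_start + timedelta(days=days)
            # The correlated subquery: count this user's rows inside the window.
            in_window = sum(window_start <= s < window_end for s in stamps)
            if in_window >= days:
                matches.append(user)
                break  # the outer GROUP BY userid reports each user once
    return sorted(matches)
```

Like the SQL, this does quadratic work per user, so it is meant to explain the query rather than to replace it on a large table.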


长情又很酷

2020-12-04 05:23
Doing this with a single SQL query seems overly complicated to me. Let me break this answer down into two parts.
1. What you should have been doing all along, and should start doing now:
  Run a daily cron job that checks, for every user, whether they have logged in today, and increments a counter if they have or resets it to 0 if they haven't.
2. What you should do now:
  - Export this table to a server that doesn't run your website and won't be needed for a while. ;)
  - Sort it by user, then date.
  - Go through it sequentially, keeping a counter...
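
The sort-and-scan steps in part 2 might look roughly like the Python sketch below. It assumes the exported table fits in memory as (userid, creationdate) pairs; `longest_streaks` is a hypothetical helper name, and the answer itself does not prescribe a language.

```python
from datetime import datetime, timedelta
from typing import Dict, Iterable, Tuple


def longest_streaks(rows: Iterable[Tuple[int, datetime]]) -> Dict[int, int]:
    """Return the longest run of consecutive calendar days for each user."""
    one_day = timedelta(days=1)
    best: Dict[int, int] = {}
    prev_user, prev_day, run = None, None, 0

    # Reduce timestamps to dates, drop duplicates, sort by user, then date.
    for user, day in sorted({(user, ts.date()) for user, ts in rows}):
        if user == prev_user and day - prev_day == one_day:
            run += 1  # same user, next calendar day: extend the streak
        else:
            run = 1   # new user or a gap: restart the counter
        best[user] = max(best.get(user, 0), run)
        prev_user, prev_day = user, day
    return best
```

Filtering the result for values of at least N then gives the users with N or more sequential days, in a single pass over the sorted data.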

死守一世寂寞

2020-12-04 05:24

A couple of SQL Server 2012 options (assuming N=100 below).

;WITH T(UserID, NRowsPrevious)
     AS (SELECT UserID,
                DATEDIFF(DAY, 
                        LAG(CreationDate, 100) 
                            OVER 
                                (PARTITION BY UserID 
                                     ORDER BY CreationDate), 
                         CreationDate)
         FROM   UserHistory)
SELECT DISTINCT UserID
FROM   T
WHERE  NRowsPrevious = 100

Though with my sample data the following turned out to be more efficient:

;WITH U
         AS (SELECT DISTINCT UserId
             FROM   UserHistory) /*Ideally replace with Users table*/
    SELECT UserId
    FROM   U
           CROSS APPLY (SELECT TOP 1 *
                        FROM   (SELECT 
                                       DATEDIFF(DAY, 
                                                LAG(CreationDate, 100) 
                                                  OVER 
                                                   (ORDER BY CreationDate), 
                                                 CreationDate)
                                FROM   UserHistory UH
                                WHERE  U.UserId = UH.UserID) T(NRowsPrevious)
                        WHERE  NRowsPrevious = 100) O

Both rely on the constraint stated in the question that there is at most one record per day per user.
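
The idea behind both queries is that, with at most one row per user per day, the row k positions earlier in date order is exactly k days earlier only when there are no gaps in between. Here is a small Python sketch of that check, with `users_with_run` as a hypothetical name. Note one assumption: the sketch uses an offset of n - 1, so that n rows spanning n - 1 day boundaries count as n consecutive days. By the same arithmetic, an offset and difference of 100 would correspond to a run of 101 calendar days.

```python
from collections import defaultdict
from datetime import date, datetime
from typing import Dict, Iterable, List, Tuple


def users_with_run(rows: Iterable[Tuple[int, datetime]], n: int) -> List[int]:
    """Return users having n consecutive days, assuming one row per user per day."""
    days_by_user: Dict[int, List[date]] = defaultdict(list)
    for user, ts in rows:
        days_by_user[user].append(ts.date())

    offset = n - 1  # how many rows to look back, like LAG(CreationDate, offset)
    found = []
    for user, days in days_by_user.items():
        days.sort()  # PARTITION BY UserID ORDER BY CreationDate
        for i in range(offset, len(days)):
            # DATEDIFF(DAY, lagged, current) == offset means no day was skipped.
            if (days[i] - days[i - offset]).days == offset:
                found.append(user)
                break  # SELECT DISTINCT: report each user once
    return sorted(found)
```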


失恋的感觉

2020-12-04 05:26

How about one using tally tables? It follows a more algorithmic approach, and the execution plan is a breeze. Populate the tally table with the numbers from 1 to 'MaxDaysBehind', the number of days back that you want to scan (e.g. 90 will look back 3 months).

declare @ContinousDays int
set @ContinousDays = 30  -- select those that have 30 consecutive days

create table #tallyTable (Tally int)
insert into #tallyTable values (1)
...
insert into #tallyTable values (90) -- insert numbers for as many days behind as you want to scan

select [UserId],count(*),t.Tally from HistoryTable 
join #tallyTable as t on t.Tally>0
where [CreationDate]> getdate()-@ContinousDays-t.Tally and 
      [CreationDate]<getdate()-t.Tally 
group by [UserId],t.Tally 
having count(*)>=@ContinousDays

drop table #tallyTable -- drop the temp table so the script can be re-run
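
As a rough illustration of what the tally join computes, the Python sketch below slides a window of @ContinousDays days back from the current time, one step per tally value, and counts each user's rows inside it. The names `users_with_window`, `now` and `max_days_behind` are assumptions made for this sketch; the SQL uses getdate() and the contents of #tallyTable instead.

```python
from collections import Counter
from datetime import datetime, timedelta
from typing import Iterable, List, Tuple


def users_with_window(
    rows: Iterable[Tuple[int, datetime]],
    now: datetime,
    continuous_days: int,
    max_days_behind: int,
) -> List[Tuple[int, int]]:
    """Return (userid, tally) pairs whose window holds enough rows."""
    rows = list(rows)
    hits = []
    # Each tally value shifts the window one more day into the past.
    for tally in range(1, max_days_behind + 1):
        lower = now - timedelta(days=continuous_days + tally)
        upper = now - timedelta(days=tally)
        # Same strict bounds as the WHERE clause in the query.
        counts = Counter(user for user, ts in rows if lower < ts < upper)
        hits.extend(
            (user, tally) for user, n in counts.items() if n >= continuous_days
        )
    return sorted(hits)
```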


隐瞒了意图╮

2020-12-04 05:27

If this is so important to you, source this event and drive a table to give you this info. No need to kill the machine with all those crazy queries.


时光说笑

2020-12-04 05:28

You could use a recursive CTE (SQL Server 2005+):

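DECLARE @numDays int
SET @numDays = 30 -- example value: the required number of consecutive days

;WITH recur_date AS (
        -- Anchor: every visit starts a chain at level 1
        SELECT t.userid,
               t.creationDate,
               DATEADD(day, 1, t.creationDate) AS nextDay,
               1 AS [level]
          FROM UserHistory t
         UNION ALL
        -- Recursive step: extend a chain when the same user has a row on the next day
        -- (the equality join assumes creationDate carries no time part)
        SELECT t.userid,
               t.creationDate,
               DATEADD(day, 1, t.creationDate) AS nextDay,
               rd.[level] + 1 AS [level]
          FROM UserHistory t
          JOIN recur_date rd ON t.creationDate = rd.nextDay AND t.userid = rd.userid)
   SELECT t.*
    FROM recur_date t
   WHERE t.[level] = @numDays
ORDER BY t.userid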
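
To show how the recursion builds up its levels, here is a minimal Python sketch that expands the chains one level at a time. It assumes timestamps are compared by calendar date only, and `users_reaching_level` is a hypothetical name rather than anything from the answer.

```python
from datetime import datetime, timedelta
from typing import Iterable, List, Tuple


def users_reaching_level(
    rows: Iterable[Tuple[int, datetime]], num_days: int
) -> List[int]:
    """Return users whose visits form a chain of num_days consecutive days."""
    one_day = timedelta(days=1)
    visited = {(user, ts.date()) for user, ts in rows}

    # Anchor member: every (user, day) pair is a chain of level 1.
    frontier = set(visited)

    # Recursive member: keep only chains that can step to the next day.
    for _ in range(num_days - 1):
        frontier = {
            (user, day + one_day)
            for user, day in frontier
            if (user, day + one_day) in visited
        }
    return sorted({user for user, _ in frontier})
```

Each pass of the loop corresponds to one level of recursion, so after num_days - 1 passes the surviving pairs are the last days of chains at level num_days.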
