SQL to determine minimum sequential days of access?

我在风中等你 2020-12-04 04:58

The following User History table contains one record for every day a given user has accessed a website (in a 24 hour UTC period). It has many thousands of r

19 Answers
  • 2020-12-04 05:09

    Tweaking Bill's query a bit. You might have to truncate the date before grouping to count only one login per day...

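    SELECT UserId
    FROM (
        -- One row per user per calendar day; the window covers exactly @n days
        SELECT DISTINCT UserId,
            DATEADD(dd, DATEDIFF(dd, 0, CreationDate), 0) AS TruncatedCreationDate
        FROM History
        WHERE CreationDate >= DATEADD(dd, DATEDIFF(dd, 0, GETUTCDATE()) - @n + 1, 0)
    ) AS d
    GROUP BY UserId
    HAVING COUNT(*) >= @n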
    

    EDITED to use DATEADD(dd, DATEDIFF(dd, 0, CreationDate), 0) instead of convert( char(10) , CreationDate, 101 ).

    @IDisposable I was looking to use DATEPART earlier, but I was too lazy to look up the syntax, so I figured I'd use CONVERT instead. I didn't know it had a significant impact. Thanks! Now I know.

  • 2020-12-04 05:13

    Something like this?

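    select distinct t1.UserId
    from UserHistory t1, UserHistory t2
    where t1.UserId = t2.UserId
      -- t2 is the visit on the last day of an n-day window starting at t1
      AND trunc(t2.CreationDate) = trunc(t1.CreationDate) + (n - 1)
      AND (
        select count(distinct trunc(t3.CreationDate))
        from UserHistory t3
        where t1.UserId  = t3.UserId
          and t3.CreationDate >= trunc(t1.CreationDate)
          and t3.CreationDate <  trunc(t1.CreationDate) + n
       ) = n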
    
  • 2020-12-04 05:14

    If you can change the table schema, I'd suggest adding a column LongestStreak to the table, which you'd set to the number of sequential days ending on the CreationDate. It's easy to update the table at login time (similar to what you are doing already: if no row exists for the current day, you check whether a row exists for the previous day. If one does, you store its LongestStreak plus one in the new row; otherwise, you set it to 1.)

    The query will be obvious after adding this column:

    if exists(select * from UserHistory
              where LongestStreak >= 30 and UserId = @UserId)
       print 'Award the Woot badge.' -- placeholder for the real award logic
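
The login-time update described above can be sketched outside the database as follows. This is a minimal illustration, not the answer's actual implementation: the `streaks` dictionary is a hypothetical stand-in for the table, and the function names are made up for the example.

```python
from datetime import date, timedelta


def record_login(streaks, user_id, today):
    """Store the streak length ending on `today` and return it.

    `streaks` is a hypothetical stand-in for the table: it maps
    (user_id, day) to the LongestStreak value stored in that day's row.
    """
    key = (user_id, today)
    if key in streaks:
        # A row already exists for today, so there is nothing to update.
        return streaks[key]
    # Extend yesterday's streak if there is one, otherwise start at 1.
    previous = streaks.get((user_id, today - timedelta(days=1)), 0)
    streaks[key] = previous + 1
    return streaks[key]


def has_streak(streaks, user_id, n):
    """Rough equivalent of the `LongestStreak >= n` existence check."""
    return any(
        length >= n
        for (uid, _day), length in streaks.items()
        if uid == user_id
    )
```

Because each row carries the streak length ending on its own day, the badge check never has to look at more than one row per user.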
    
  • 2020-12-04 05:14

    Some nicely expressive SQL along the lines of:

    select
        userId,
        dbo.MaxConsecutiveDates(CreationDate) as blah
    from
        dbo.Logins
    group by
        userId
    

    This assumes you have a user-defined aggregate function along the lines of the following (beware, this is buggy):

    using System;
    using System.Data.SqlTypes;
    using Microsoft.SqlServer.Server;
    using System.Runtime.InteropServices;
    
    namespace SqlServerProject1
    {
        [StructLayout(LayoutKind.Sequential)]
        [Serializable]
        internal struct MaxConsecutiveState
        {
            public int CurrentSequentialDays;
            public int MaxSequentialDays;
            public SqlDateTime LastDate;
        }
    
        [Serializable]
        [SqlUserDefinedAggregate(
            Format.Native,
            IsInvariantToNulls = true, //optimizer property
            IsInvariantToDuplicates = false, //optimizer property
            IsInvariantToOrder = false) //optimizer property
        ]
        [StructLayout(LayoutKind.Sequential)]
        public class MaxConsecutiveDates
        {
            /// <summary>
            /// The variable that holds the intermediate result of the aggregation
            /// </summary>
            private MaxConsecutiveState _intermediateResult;
    
            /// <summary>
            /// Initialize the internal data structures
            /// </summary>
            public void Init()
            {
                _intermediateResult = new MaxConsecutiveState { LastDate = SqlDateTime.MinValue, CurrentSequentialDays = 0, MaxSequentialDays = 0 };
            }
    
            /// <summary>
            /// Accumulate the next value, skipping nulls
            /// </summary>
            /// <param name="value"></param>
            public void Accumulate(SqlDateTime value)
            {
                if (value.IsNull)
                {
                    return;
                }
                int sequentialDays = _intermediateResult.CurrentSequentialDays;
                int maxSequentialDays = _intermediateResult.MaxSequentialDays;
                DateTime currentDate = value.Value.Date;
                // Compare against the previously seen date (TimeTicks only holds the time-of-day part)
                if (currentDate.AddDays(-1).Equals(_intermediateResult.LastDate.Value))
                    sequentialDays++;
                else
                {
                    maxSequentialDays = Math.Max(sequentialDays, maxSequentialDays);
                    sequentialDays = 1;
                }
                _intermediateResult = new MaxConsecutiveState
                                          {
                                              CurrentSequentialDays = sequentialDays,
                                              LastDate = currentDate,
                                              MaxSequentialDays = maxSequentialDays
                                          };
            }
    
            /// <summary>
            /// Merge the partially computed aggregate with this aggregate.
            /// </summary>
            /// <param name="other"></param>
            public void Merge(MaxConsecutiveDates other)
            {
                // add stuff for two separate calculations
            }
    
            /// <summary>
            /// Called at the end of aggregation, to return the results of the aggregation.
            /// </summary>
            /// <returns></returns>
            public SqlInt32 Terminate()
            {
                // No narrowing casts: sbyte would overflow for streaks longer than 127 days
                int max = Math.Max(_intermediateResult.CurrentSequentialDays, _intermediateResult.MaxSequentialDays);
                return new SqlInt32(max);
            }
        }
    }
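
For comparison, here is a small sketch of the calculation the aggregate is aiming for. It sorts and de-duplicates the dates first, since a user-defined aggregate is not guaranteed to receive rows in date order. The function name is illustrative only and is not part of the answer's code.

```python
from datetime import datetime, timedelta


def max_consecutive_days(timestamps):
    """Return the longest run of consecutive calendar days in `timestamps`."""
    # Truncate to dates, drop nulls and duplicates, and sort so that
    # the order in which values arrive does not matter.
    days = sorted({ts.date() for ts in timestamps if ts is not None})
    best = 0
    current = 0
    previous = None
    for day in days:
        if previous is not None and day - previous == timedelta(days=1):
            current += 1  # the run continues
        else:
            current = 1   # a gap (or the first day) starts a new run
        best = max(best, current)
        previous = day
    return best
```

Two visits on the same day count once here, whereas the accumulate step above would reset the streak if it saw the same date twice.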
    
  • 2020-12-04 05:18

    Off the top of my head, MySQLish:

    SELECT start.UserId
    FROM UserHistory AS start
      LEFT OUTER JOIN UserHistory AS pre_start ON pre_start.UserId=start.UserId
        AND DATE(pre_start.CreationDate)=DATE_SUB(DATE(start.CreationDate), INTERVAL 1 DAY)
      LEFT OUTER JOIN UserHistory AS subsequent ON subsequent.UserId=start.UserId
        -- 30-day window: the start date plus the following 29 days
        AND DATE(subsequent.CreationDate) BETWEEN DATE(start.CreationDate)
          AND DATE_ADD(DATE(start.CreationDate), INTERVAL 29 DAY)
    WHERE pre_start.Id IS NULL
    GROUP BY start.Id, start.UserId
    HAVING COUNT(DISTINCT DATE(subsequent.CreationDate))=30
    

    Untested, and it almost certainly needs some conversion for MSSQL, but I think it gives some ideas.
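
The idea behind the query (find days with no visit on the previous day, then check that the following days are all present) can be sketched in plain Python. The `(user_id, day)` pair format and the function name are assumptions made for the example.

```python
from datetime import date, timedelta


def users_with_streak(visits, n=30):
    """Return the users who have at least n consecutive days of visits.

    `visits` is an iterable of (user_id, day) pairs with `day` already
    truncated to a calendar date.
    """
    seen = set(visits)
    winners = set()
    for user_id, day in seen:
        # Only consider days that start a run: no visit the day before.
        if (user_id, day - timedelta(days=1)) in seen:
            continue
        # The run qualifies if the start day and the next n - 1 days all exist.
        if all((user_id, day + timedelta(days=k)) in seen for k in range(n)):
            winners.add(user_id)
    return winners
```

Checking only run starts is enough, because any run of n or more days contains a qualifying window beginning at its first day.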

  • 2020-12-04 05:21

    Seems like you could take advantage of the fact that to be continuous over n days would require there to be n rows.

    So something like:

    SELECT users.UserId, count(1) as cnt
    FROM users
    WHERE users.CreationDate > now() - INTERVAL 30 DAY
    GROUP BY UserId
    HAVING cnt = 30
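
The counting argument only holds when the window spans exactly n calendar days and each day is counted once. A minimal sketch of that check, with a hypothetical function name and `today` passed in explicitly:

```python
from datetime import date, timedelta


def active_every_day(days, today, n=30):
    """True if `days` contains every one of the n calendar days ending on `today`."""
    window_start = today - timedelta(days=n - 1)
    # Distinct dates inside an n-day window: if there are n of them,
    # none can be missing, so the visits are consecutive.
    in_window = {d for d in days if window_start <= d <= today}
    return len(in_window) == n
```

Note that this only detects a streak ending today, not a streak that happened at some earlier point in the history.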
    