MySQL: Counting growth over time

…衆ロ難τιáo~ submitted on 2019-12-11 00:59:51

Question


I posted about this a few weeks ago, but I don't think I asked the question clearly because the answers I got were not what I was looking for. I think it's best to start again.

I'm trying to query a database to retrieve the number of unique entries over time. The data looks something like this:

Day | UserID
1 | A
1 | B
2 | B
3 | A
4 | B
4 | C
5 | D

I'd like the query result to look like this:

Time Span | COUNT(DISTINCT UserID)
Day 1 to Day 1 | 2
Day 1 to Day 2 | 2
Day 1 to Day 3 | 2
Day 1 to Day 4 | 3
Day 1 to Day 5 | 4

If I run something like

SELECT COUNT(DISTINCT `UserID`) FROM `table` GROUP BY `Day`

the distinct counts are computed separately for each day, so they ignore user IDs already seen on previous days.

Any ideas? The data set I'm using is quite large, so running multiple queries and post-processing the results takes a long time (that's how I'm currently doing it).

Thanks


Answer 1:


You can use a correlated subquery.

Sample table

create table visits (day int, userid char(1));
insert into visits values
(1,'a'),
(1,'b'),
(2,'b'),
(3,'a'),
(4,'b'),
(4,'c'),
(5,'d');

The query

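-- for each distinct day, count the distinct users seen on or before that day
select d.day,
       (select count(distinct v.userid) from visits v where v.day <= d.day) as users_to_date
from (select distinct day from visits) d
order by d.day;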
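
To check the query above against the sample data, here is a minimal sketch that runs it from Python. It uses an in-memory SQLite database purely as a stand-in for MySQL (an assumption made for convenience; the thread itself is about MySQL), and the column alias `users_to_date` is a name introduced for illustration only.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("create table visits (day int, userid char(1))")
conn.executemany(
    "insert into visits values (?, ?)",
    [(1, "a"), (1, "b"), (2, "b"), (3, "a"), (4, "b"), (4, "c"), (5, "d")],
)

# For each distinct day, count the distinct users seen on or before that day.
# "users_to_date" is an illustrative alias, not part of the original answer.
query = """
select d.day,
       (select count(distinct v.userid) from visits v where v.day <= d.day) as users_to_date
from (select distinct day from visits) d
order by d.day
"""
rows = conn.execute(query).fetchall()

for day, users_to_date in rows:
    print(f"Day 1 to Day {day} | {users_to_date}")
```

On the sample table this yields the counts 2, 2, 2, 3, 4 for days 1 through 5, matching the result the question asks for. Note that the subquery rescans the table once per day, so on a large table an index on the day column is likely to matter.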



Answer 2:


How about something like this:

SELECT Count(UserID), Day
FROM
    (SELECT Count(UserID) as Logons, UserID, Day
    FROM yourDailyLog
    GROUP BY Day, UserID) AS t  -- MySQL requires an alias on a derived table
GROUP BY Day

The inner select should eliminate duplicate visits by the same user on a given day.

Stay away from DISTINCT. It is usually a questionable approach to almost any SQL problem.

Wait: I see now that you want the time span to widen with each row (Day 1 to Day N). That makes things a little trickier. Why not aggregate the rest of this information in application code rather than doing it all in SQL?
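
As a rough sketch of what aggregating in code could look like: if the rows are fetched once, ordered by day, a single pass with a set of already-seen user IDs gives the running distinct count. The function name `cumulative_distinct_users` and the in-memory `sample` list are hypothetical and used only for illustration; in practice the rows would come from one `ORDER BY Day` query.

```python
def cumulative_distinct_users(rows):
    """Return [(day, distinct users seen up to and including that day), ...].

    Assumes `rows` is an iterable of (day, user_id) pairs sorted by day.
    """
    seen = set()          # every user ID encountered so far
    result = []
    current_day = None
    for day, user_id in rows:
        # When the day changes, the previous day's running total is final.
        if current_day is not None and day != current_day:
            result.append((current_day, len(seen)))
        current_day = day
        seen.add(user_id)
    # Emit the total for the last day, if there were any rows at all.
    if current_day is not None:
        result.append((current_day, len(seen)))
    return result


# Sample data from the question (hypothetical in-memory stand-in for the table).
sample = [(1, "A"), (1, "B"), (2, "B"), (3, "A"), (4, "B"), (4, "C"), (5, "D")]
growth = cumulative_distinct_users(sample)

for day, count in growth:
    print(f"Day 1 to Day {day} | {count}")
```

This reads the data once and keeps only the set of user IDs in memory, which avoids the multiple queries mentioned in the question. Whether that set fits comfortably in memory depends on the number of distinct users.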



Source: https://stackoverflow.com/questions/5317917/msql-counting-growth-over-time