I want to calculate a count of items over time using their Start and End dates.
Some sample data
START <- as.Date(c("2014-01-01", "2014-01-02
Using dplyr and grouped data:
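library(dplyr)

# Sample intervals: one row per item, with inclusive START and END dates.
# tibble()/as_tibble() replace the deprecated data_frame()/as_data_frame().
df <- tibble(
  START = as.Date(c("2014-01-01", "2014-01-02", "2014-01-03", "2014-01-03")),
  END   = as.Date(c("2014-01-04", "2014-01-03", "2014-01-03", "2014-01-04"))
)
# Duplicate the same intervals under two groups, "a" and "b".
df <- as_tibble(rbind(cbind(group = "a", df), cbind(group = "b", df)))
df

# Per group: expand each interval into its days, then tabulate the days.
df %>%
  group_by(., group) %>%
  do(data.frame(table(Reduce(c, Map(seq, .$START, .$END, by = 1)))))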
This is a common problem, for example when you want to find the number of logins on different pages, machines, etc., given a time interval per user.
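To see what the nested Map/Reduce/table call is doing, it may help to unpack it for a single group in plain base R. This is only a sketch: the intermediate names days_per_item, all_days and counts are made up for illustration, and the dates are the four sample intervals from above.

```r
# Intervals for a single group, copied from the sample data.
START <- as.Date(c("2014-01-01", "2014-01-02", "2014-01-03", "2014-01-03"))
END   <- as.Date(c("2014-01-04", "2014-01-03", "2014-01-03", "2014-01-04"))

# Step 1: expand every interval into the sequence of days it covers
# (one Date vector per item; `by = 1` means a step of one day).
days_per_item <- Map(seq, START, END, by = 1)

# Step 2: concatenate the per-item sequences into one long Date vector.
all_days <- Reduce(c, days_per_item)

# Step 3: tabulate how many items cover each day.
counts <- table(all_days)
counts
```

Inside do(), the same three steps run once per group, and data.frame() turns the resulting table into the Var1/Freq columns.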
> df
Source: local data frame [8 x 3]
  group      START        END
  (chr)     (date)     (date)
1     a 2014-01-01 2014-01-04
2     a 2014-01-02 2014-01-03
3     a 2014-01-03 2014-01-03
4     a 2014-01-03 2014-01-04
5     b 2014-01-01 2014-01-04
6     b 2014-01-02 2014-01-03
7     b 2014-01-03 2014-01-03
8     b 2014-01-03 2014-01-04
>
> df %>%
+ group_by(.,group) %>%
+ do(data.frame(table(Reduce(c, Map(seq, .$START, .$END, by = 1)))))
Source: local data frame [8 x 3]
Groups: group [2]
  group       Var1  Freq
  (chr)     (fctr) (int)
1     a 2014-01-01     1
2     a 2014-01-02     2
3     a 2014-01-03     4
4     a 2014-01-04     2
5     b 2014-01-01     1
6     b 2014-01-02     2
7     b 2014-01-03     4
8     b 2014-01-04     2
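As a cross-check on the counts above, here is a rough base R sketch that takes a different route: instead of expanding the intervals, it walks over every calendar day in a group's range and counts the intervals that cover it. The helper name count_active and the column names day and n are hypothetical, not part of dplyr or of the answer above.

```r
# Sample data from above, rebuilt with base R only.
dates_start <- c("2014-01-01", "2014-01-02", "2014-01-03", "2014-01-03")
dates_end   <- c("2014-01-04", "2014-01-03", "2014-01-03", "2014-01-04")
df <- data.frame(
  group = rep(c("a", "b"), each = 4),
  START = as.Date(rep(dates_start, 2)),
  END   = as.Date(rep(dates_end, 2))
)

# Hypothetical helper: for one group's rows, count the intervals
# with START <= day <= END for every day in the group's date range.
count_active <- function(g) {
  days <- seq(min(g$START), max(g$END), by = "day")
  n <- vapply(days, function(d) sum(g$START <= d & d <= g$END), integer(1))
  data.frame(group = g$group[1], day = days, n = n)
}

# Apply the helper to each group and stack the per-group results.
result <- do.call(rbind, lapply(split(df, df$group), count_active))
rownames(result) <- NULL
result
```

On the sample data this gives the same per-day counts as the table() approach. One difference worth knowing: a day inside the range that no interval covers shows up here with n = 0, whereas table() simply omits it.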