I have a data set like the one below, and I want to calculate the average time difference between consecutive dates for each unique ID.
data:
membership_id created_date
1 12000000 2015
Coming from plyr, you can probably transition very easily to dplyr. It won't be quite as fast as data.table, but it will be much faster than ddply.
library(dplyr)

# Sort by date, then take the mean gap (in days) between consecutive dates per id
dat %>% group_by(membership_id) %>%
  arrange(created_date) %>%
  summarize(avg = as.numeric(mean(diff(created_date))))
# Source: local data frame [3 x 2]
#
# membership_id avg
# (int) (dbl)
# 1 12000000 555
# 2 12000001 262
# 3 12000003 391
With little extra effort, you can speed things up further by converting to a data.table object while still using the dplyr commands. Pure data.table will be faster still.
(Using this data:)
dat = structure(list(membership_id = c(12000000L, 12000001L, 12000001L,
12000001L, 12000001L, 12000003L, 12000003L, 12000000L, 12000000L
), created_date = structure(c(16455, 15663, 15985, 16135, 16449,
15744, 16135, 16106, 15345), class = "Date")), .Names = c("membership_id",
"created_date"), row.names = c("1", "2", "3", "4", "5", "6",
"7", "8", "9"), class = "data.frame")
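For the pure data.table route mentioned above, a minimal sketch could look like the following. It rebuilds the same nine rows so that it runs on its own; the names `dt` and `res` are just illustrative, and this is one reasonable way to write it rather than the only one.

```r
library(data.table)

# Same nine rows as the dput() output above, rebuilt compactly
dat <- data.frame(
  membership_id = c(12000000L, 12000001L, 12000001L, 12000001L, 12000001L,
                    12000003L, 12000003L, 12000000L, 12000000L),
  created_date  = as.Date(c(16455, 15663, 15985, 16135, 16449,
                            15744, 16135, 16106, 15345),
                          origin = "1970-01-01")
)

dt <- as.data.table(dat)

# Order rows by date first, then compute the mean gap in days per id;
# keyby also sorts the result by membership_id
res <- dt[order(created_date),
          .(avg = as.numeric(mean(diff(created_date)))),
          keyby = membership_id]

print(res)
```

On this data it should give the same averages as the dplyr version (555, 262 and 391 days). Note that an id with a single row has no gaps, so its average comes out as NaN.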