Grouping every n minutes with dplyr

南方客 2020-12-03 05:11

I have a dataset containing 10 events, each occurring at a certain time on a given day, with a corresponding value for each event:

d1 <- data.frame(date = as.POSIX         


        
4 Answers
  •  眼角桃花
    2020-12-03 06:02

    A lubridate/dplyr-style solution:

    library(lubridate)
    library(dplyr)
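    # d2: every 3-minute interval of the day (1440/3 = 480 rows)
    d2 <- data.frame(interval = seq(ymd_hms('2010-05-21 00:00:00'), by = '3 min',length.out=(1440/3)))
    d3 <- d1 %>% 
      # floor each timestamp to the start of its 3-minute interval
      mutate(interval = floor_date(date, unit="hour")+minutes(floor(minute(date)/3)*3)) %>% 
      group_by(interval) %>% 
      # one row per interval; mutate() here would repeat the sum on every event row
      summarise(sumvalue=sum(value))
    # keep all intervals; those without events get NA
    d4 <- merge(d2,d3, all=TRUE) # better if left_join is used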
    tail(d4)
    #               interval sumvalue
    #475 2010-05-21 23:42:00       NA
    #476 2010-05-21 23:45:00       NA
    #477 2010-05-21 23:48:00       NA
    #478 2010-05-21 23:51:00       NA
    #479 2010-05-21 23:54:00       NA
    #480 2010-05-21 23:57:00       NA
    d4[450,]
    #               interval sumvalue
    #450 2010-05-21 22:27:00   643426
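
    The interval expression floors each timestamp to the hour and then adds back the minutes rounded down to a multiple of 3. Below is a minimal sketch of that step on its own, using made-up timestamps (not the question's data). It also compares the result with `floor_date(x, "3 minutes")`, a shorter form that lubridate supports through multi-unit rounding; whether it is available depends on the installed lubridate version.

    ```r
    library(lubridate)

    # Made-up timestamps for illustration only.
    x <- ymd_hms(c("2010-05-21 22:28:05",
                   "2010-05-21 00:02:59",
                   "2010-05-21 13:45:00"))

    # The expression from the answer: hour floor + minutes floored to a multiple of 3.
    manual <- floor_date(x, unit = "hour") + minutes(floor(minute(x) / 3) * 3)

    # Shorter form using multi-unit rounding.
    short <- floor_date(x, unit = "3 minutes")

    # Both should put each timestamp in the same 3-minute bin.
    same_bins <- all(manual == short)
    ```

    Because 60 is divisible by 3, the bins restart cleanly at every hour, so both forms should agree for any timestamp.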
    

    If you are comfortable working with Date (I am not), you can dispense with lubridate, and replace the final merge with left_join.
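
    A rough sketch of that lubridate-free variant, assuming the timestamps are POSIXct in UTC. The `d1` below is a hypothetical stand-in (the question's data is not reproduced here), and `bin_secs` is a name introduced only for this example.

    ```r
    library(dplyr)

    # Hypothetical stand-in for d1; values are made up.
    d1 <- data.frame(
      date  = as.POSIXct(c("2010-05-21 00:01:10", "2010-05-21 00:02:40",
                           "2010-05-21 22:28:05"), tz = "UTC"),
      value = c(10, 5, 7)
    )

    bin_secs <- 3 * 60  # width of one bin, in seconds

    # Full grid of 3-minute bins for the day (1440 / 3 = 480 rows).
    d2 <- data.frame(
      interval = seq(as.POSIXct("2010-05-21 00:00:00", tz = "UTC"),
                     by = "3 min", length.out = 1440 / 3)
    )

    # Floor each timestamp to the start of its bin using plain arithmetic
    # on seconds since the epoch, then sum the values per bin.
    d3 <- d1 %>%
      mutate(interval = as.POSIXct(floor(as.numeric(date) / bin_secs) * bin_secs,
                                   origin = "1970-01-01", tz = "UTC")) %>%
      group_by(interval) %>%
      summarise(sumvalue = sum(value))

    # left_join keeps every bin in d2; bins without events get NA.
    d4 <- left_join(d2, d3, by = "interval")
    ```

    The arithmetic floor aligns bins to multiples of 180 seconds since the epoch, which coincides with midnight only in UTC or in time zones whose offset is a multiple of 3 minutes. For other zones the bins would need an explicit offset.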
