Averaging values in a data frame for a certain hour and month in R

后端 未结 2 1432
爱一瞬间的悲伤
爱一瞬间的悲伤 2021-01-03 18:41

I\'ve been scouring the net but haven\'t found a solution to this quite possibly simple problem.

This is the half-hourly data using the library \'xts\',



        
相关标签:
2条回答
  • 2021-01-03 19:26

    Try

    library(dplyr)
    res <- dat %>% 
               group_by(month=format(datetime, '%m'),
                  #year=format(datetime, '%Y'), #if you need year also
                  # as grouping variable
                  hour=format(as.POSIXct(cut(datetime, breaks='hour')), '%H')) %>%
               summarise(Meanval=mean(val, na.rm=TRUE))   
    
    
     head(res,3)
     #  month hour     Meanval
     #1    01   00 -0.02780036
     #2    01   01 -0.06589948
     #3    01   02 -0.02166218
    

    Update

    If your datetime is POSIXlt you could convert it to POSIXct.

      dat$datetime <- as.POSIXlt(dat$datetime)
    

    By running the above code, I get the error

       # Error: column 'datetime' has unsupported type
    

    You could use mutate and convert the datetime to POSIXct class by as.POSIXct

      res1 <-  dat %>% 
                   mutate(datetime= as.POSIXct(datetime)) %>%
                   group_by(month=format(datetime, '%m'),
                     #year=format(datetime, '%Y'), #if you need year also
                     # as grouping variable
                      hour=format(as.POSIXct(cut(datetime, breaks='hour')), '%H')) %>%
                   summarise(Meanval=mean(val, na.rm=TRUE))  
    

    data

    set.seed(24)
    dat <- data.frame(datetime=seq(Sys.time(), by='1 hour', length.out=2000),
        val=rnorm(2000))
    
    0 讨论(0)
  • 2021-01-03 19:36

    If I understand you correctly, you want to average all the values for a given hour, for all the days in a given month, and do this for all months. So average all the values between midnight and 00:59:59 for all the days in a given month, etc.

    I see that you want to avoid xts but aggregate.zoo(...) was designed for this, and avoids dplyr and cut.

    library(xts)
    # creates sample dataset...
    set.seed(1)
    data <- rnorm(1000)
    data.xts <- as.xts(data, as.POSIXct("2007-08-24 17:30:00") +
                         1800 * (1:length(data)))
    
    # using aggregate.zoo(...)
    as.hourly <- function(x) format(x,"%Y-%m %H")
    result    <- aggregate(data.xts,by=as.hourly,mean)
    result    <- data.frame(result)
    head(result)
    #                result
    # 2007-08 00 0.12236024
    # 2007-08 01 0.41593567
    # 2007-08 02 0.22670817
    # 2007-08 03 0.23402842
    # 2007-08 04 0.22175078
    # 2007-08 05 0.05081899
    
    0 讨论(0)
提交回复
热议问题