Calculate average monthly total by groups from data.table in R

暖寄归人 2020-12-30 07:29

I have a data.table with a row for each day over a 30-year period with a number of different variable columns. The reason for using data.table is that the .csv file I'm usi

3 Answers
  •  爱一瞬间的悲伤
    2020-12-30 08:11

    The only way I could think of was to do it in two steps. It's probably not the best way, but here goes:

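    # Extract "YYYY-MM" and "MM" from Date
    DT[, c("YM", "Month") := list(substr(Date, 1, 7), substr(Date, 6, 7))]
    # Total runoff per Key and year-month, stored on every row of that group
    DT[, Runoff2 := sum(Runoff), by = c("Key", "YM")]
    # Average of those totals per Key and calendar month
    DT[, mean(Runoff2), by = c("Key", "Month")]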
    
    ##   Key Month       V1
    ## 1:   A    01 4.366667
    ## 2:   B    01 3.266667
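
A minimal sketch of the same two steps written as a chained aggregation, so that the mean is taken over exactly one total per month. The sample data is hypothetical (two keys, constant daily runoff, dates stored as "YYYY-MM-DD" strings), and the names `monthly`, `avg`, `rowwise`, `Total` and `MeanTotal` are made up for illustration. The row-wise variant is included for comparison: because every daily row carries its month's total, months with more rows weigh more, which can shift the result slightly when a month's day count differs between years (February in leap years) or when days are missing.

```r
library(data.table)

# Hypothetical sample data: two keys, one row per day for 2000 (leap year) and 2001.
# Dates are "YYYY-MM-DD" strings, which is what the substr() calls assume.
days <- format(seq(as.Date("2000-01-01"), as.Date("2001-12-31"), by = "day"))
DT <- data.table(Key = rep(c("A", "B"), each = length(days)),
                 Date = rep(days, times = 2))
DT[, Runoff := ifelse(Key == "A", 1, 2)]  # constant daily runoff per key

# Step 1: one total per Key and year-month (Month is carried along for step 2).
monthly <- DT[, .(Total = sum(Runoff)),
              by = .(Key, YM = substr(Date, 1, 7), Month = substr(Date, 6, 7))]

# Step 2: average the monthly totals across years.
avg <- monthly[, .(MeanTotal = mean(Total)), by = .(Key, Month)]

# For comparison: the row-wise variant, where every daily row carries its month's total.
DT[, Runoff2 := sum(Runoff), by = .(Key, substr(Date, 1, 7))]
rowwise <- DT[, .(MeanTotal = mean(Runoff2)), by = .(Key, Month = substr(Date, 6, 7))]

avg[Key == "A" & Month == "02"]      # MeanTotal is 28.5 (mean of 29 and 28)
rowwise[Key == "A" & Month == "02"]  # MeanTotal is about 28.509 (weighted by day count)
```

For months that always have the same number of complete days, such as January, both variants give the same number.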
    

    Just to show another (very similar) way:

    DT[, c("year", "month") := list(year(Date), month(Date))]
    DT[, Runoff2 := sum(Runoff), by=list(Key, year, month)]
    DT[, mean(Runoff2), by=list(Key, month)]
    

    Note that you don't have to create new columns, because by also accepts expressions. That is, you can use them directly in by, as follows:

    DT[, Runoff2 := sum(Runoff), by=list(Key, year = year(Date), month = month(Date))]
    

    But since you need to aggregate more than once, it's better (for speed) to store them as additional columns, as @David has shown here.
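
As a rough check that the two styles are interchangeable, the sketch below runs the aggregation once with expressions in by and once with stored year and month columns, then compares the results. The data is hypothetical (a single key with increasing daily runoff), and `expr_res`, `col_res`, `Total` and `MeanTotal` are illustrative names only. It does not measure speed.

```r
library(data.table)

# Hypothetical data: one key, one row per day for 2001 and 2002.
dates <- seq(as.Date("2001-01-01"), as.Date("2002-12-31"), by = "day")
DT <- data.table(Key = "A", Date = dates, Runoff = as.numeric(seq_along(dates)))

# Variant 1: grouping expressions written directly in by.
expr_res <- DT[, .(Total = sum(Runoff)),
               by = .(Key, year = year(Date), month = month(Date))][
                 , .(MeanTotal = mean(Total)), by = .(Key, month)]

# Variant 2: compute year and month once, store them, then group by the columns.
DT[, c("year", "month") := list(year(Date), month(Date))]
col_res <- DT[, .(Total = sum(Runoff)), by = .(Key, year, month)][
  , .(MeanTotal = mean(Total)), by = .(Key, month)]

# Both variants should yield the same table: one row per Key and month.
isTRUE(all.equal(expr_res, col_res))
```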
