Calculate average monthly total by groups from data.table in R

暖寄归人 2020-12-30 07:29

I have a data.table with a row for each day over a 30-year period, with a number of different variable columns. The reason for using data.table is that the .csv file I'm usi

3 Answers
  •  抹茶落季
    2020-12-30 08:02

    Since you said in your question that you would be open to a completely new solution, you could try the following with dplyr:

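    # Parse the dates, then derive a year-month key and a month-only key
    df$Date <- as.Date(df$Date, format="%Y-%m-%d")
    df$Year.Month <- format(df$Date, '%Y-%m')
    df$Month <- format(df$Date, '%m')
    
    library(dplyr)
    
    df %>%
      group_by(Key, Year.Month, Month) %>%
      summarize(Runoff = sum(Runoff)) %>%   # total per Key and calendar month
      ungroup() %>%
      group_by(Key, Month) %>%
      summarize(mean(Runoff))               # average of those totals across years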
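
    As a rough, self-contained illustration of the two-step aggregation above, the sketch below runs it on a small made-up data frame. The toy values, the names monthly_totals, monthly_means and MeanRunoff, and the use of the .groups argument (which assumes dplyr 1.0.0 or later) are illustrative assumptions and not part of the original question.

```r
library(dplyr)

# Hypothetical toy data: two keys, two days in January of two different years
df <- data.frame(
  Key    = rep(c("A", "B"), each = 4),
  Date   = as.Date(rep(c("2001-01-01", "2001-01-02",
                         "2002-01-01", "2002-01-02"), times = 2)),
  Runoff = c(1, 2, 3, 4, 10, 20, 30, 40)
)
df$Year.Month <- format(df$Date, "%Y-%m")
df$Month      <- format(df$Date, "%m")

# Step 1: total runoff per key and calendar month (one row per Key/Year.Month)
monthly_totals <- df %>%
  group_by(Key, Year.Month, Month) %>%
  summarize(Runoff = sum(Runoff), .groups = "drop")

# Step 2: average those monthly totals across years, per key and month
monthly_means <- monthly_totals %>%
  group_by(Key, Month) %>%
  summarize(MeanRunoff = mean(Runoff), .groups = "drop")

print(monthly_means)
```

    With this toy input, key A has January totals of 3 and 7, so its January mean is 5; key B has totals of 30 and 70, giving 50.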
    

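    EDIT #1, after a comment by @Henrik: the same can be done more concisely. By default, each summarize() call drops the last grouping variable, so after the first summarize() the data is still grouped by Key and Month and no explicit regrouping is needed: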

    df %>%
      group_by(Key, Month, Year.Month) %>%
      summarize(Runoff = sum(Runoff)) %>%
      summarize(mean(Runoff))
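
    To check the grouping behaviour this shorter version relies on, one can inspect the grouping variables after each step with group_vars(). This is only a sketch on made-up data; the names step1, step2 and MeanRunoff are hypothetical, and it assumes summarize() is left at its default grouping behaviour.

```r
library(dplyr)

# Hypothetical toy data, already reduced to one row per day-level record
df <- data.frame(
  Key        = c("A", "A", "B"),
  Month      = c("01", "01", "01"),
  Year.Month = c("2001-01", "2002-01", "2001-01"),
  Runoff     = c(3, 7, 30)
)

# First summarize(): collapses Year.Month groups and drops that grouping level
step1 <- df %>%
  group_by(Key, Month, Year.Month) %>%
  summarize(Runoff = sum(Runoff))
print(group_vars(step1))   # remaining groups: Key, Month

# Second summarize(): averages within Key/Month and drops the Month level
step2 <- step1 %>%
  summarize(MeanRunoff = mean(Runoff))
print(group_vars(step2))   # remaining group: Key
```

    The single remaining group after the second step is consistent with the "#Groups: Key" line in the printed result of the answer.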
    

    EDIT #2, to round things off: here is another way of doing it, in which the second grouping is more explicit. Thanks to @Henrik for his comments.

    df %>%
      group_by(Key, Month, Year.Month) %>%
      summarize(Runoff = sum(Runoff)) %>%
      group_by(Key, Month, .add = FALSE) %>%    # regroup by Key and Month only, not Year.Month (dplyr < 1.0.0 spelled this argument `add`)
      summarize(mean(Runoff))
    

    It produces the following result:

    #Source: local data frame [2 x 3]
    #Groups: Key
    #
    #  Key Month mean(Runoff)
    #1   A    01     4.366667
    #2   B    01     3.266667
    

    You can then reshape the result into your desired layout with, e.g., reshape2. Suppose you stored the output of the above operation in a data.frame df2; then you could do:

    require(reshape2)
    
    df2 <- dcast(df2, Key  ~ Month, sum, value.var = "mean(Runoff)")
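
    Since the question starts from a data.table, the same two-step aggregation and the reshape can probably also be done in data.table itself, which ships its own dcast() method. The following is a minimal sketch on made-up data, not the asker's actual file; DT, monthly_totals, monthly_means, MeanRunoff and wide are hypothetical names.

```r
library(data.table)

# Hypothetical toy data: two keys, a few days in January and February of two years
dates <- as.Date(c("2001-01-01", "2001-01-02", "2001-02-01",
                   "2002-01-01", "2002-02-01", "2002-02-02"))
DT <- data.table(
  Key    = rep(c("A", "B"), each = 6),
  Date   = rep(dates, times = 2),
  Runoff = c(1, 2, 3, 4, 5, 6, 10, 20, 30, 40, 50, 60)
)

# Add year-month and month columns by reference
DT[, `:=`(Year.Month = format(Date, "%Y-%m"), Month = format(Date, "%m"))]

# Step 1: total runoff per key and calendar month
monthly_totals <- DT[, .(Runoff = sum(Runoff)), by = .(Key, Year.Month, Month)]

# Step 2: mean of those totals across years, per key and month
monthly_means <- monthly_totals[, .(MeanRunoff = mean(Runoff)), by = .(Key, Month)]

# Reshape to one row per key and one column per month
wide <- dcast(monthly_means, Key ~ Month, value.var = "MeanRunoff")
print(wide)
```

    Here key A has January totals of 3 and 4 (mean 3.5) and February totals of 3 and 11 (mean 7); the values for key B are ten times larger.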
    
