Calculate average monthly total by groups from data.table in R


暖寄归人 2020-12-30 07:29

I have a data.table with a row for each day over a 30 year period with a number of different variable columns. The reason for using data.table is that the .csv file I'm usi

3 answers

抹茶落季 (original poster)

2020-12-30 08:02

Since you said in your question that you would be open to a completely new solution, you could try the following with dplyr:

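# Parse the dates, then derive a year-month key and a month-of-year key
df$Date <- as.Date(df$Date, format = "%Y-%m-%d")
df$Year.Month <- format(df$Date, "%Y-%m")
df$Month <- format(df$Date, "%m")

library(dplyr)

df %>%
  group_by(Key, Year.Month, Month) %>%
  summarize(Runoff = sum(Runoff)) %>%   # total runoff per Key and calendar month
  ungroup() %>%
  group_by(Key, Month) %>%
  summarize(mean(Runoff))               # mean of the monthly totals per Key and month of year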

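EDIT #1 after comment by @Henrik: each summarize() call drops the last grouping variable, so after the first summarize() the data is already grouped by Key and Month. The same can therefore be done by: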

df %>%
  group_by(Key, Month, Year.Month) %>%
  summarize(Runoff = sum(Runoff)) %>%
  summarize(mean(Runoff))
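
To see why the shorter version works, it may help to inspect the grouping left over after the first summarize(). The snippet below is only a sketch: the data frame and its values are made up for illustration and are not the asker's data.

```r
library(dplyr)

# Hypothetical sample data (values invented for illustration)
df <- data.frame(
  Key        = rep(c("A", "B"), each = 4),
  Month      = "01",
  Year.Month = rep(c("1980-01", "1980-01", "1981-01", "1981-01"), times = 2),
  Runoff     = c(1, 2, 3, 4, 10, 20, 30, 40)
)

step1 <- df %>%
  group_by(Key, Month, Year.Month) %>%
  summarize(Runoff = sum(Runoff))

# The last grouping variable (Year.Month) has been dropped;
# the remaining groups are Key and Month
remaining_groups <- group_vars(step1)
print(remaining_groups)
```

A second summarize() on step1 therefore averages the monthly totals within each Key and Month combination.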

EDIT #2 to round things off: here is another way of doing it, in which the second grouping is explicit (thanks to @Henrik for his comments):

df %>%
  group_by(Key, Month, Year.Month) %>%
  summarize(Runoff = sum(Runoff)) %>%
  group_by(Key, Month) %>%    # regroup by Key and Month only; group_by() replaces existing groups by default
  summarize(mean(Runoff))

It produces the following result:

#Source: local data frame [2 x 3]
#Groups: Key
#
#  Key Month mean(Runoff)
#1   A    01     4.366667
#2   B    01     3.266667
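
Since the question is about data.table, the same two-step aggregation could probably be written there by chaining two grouped `[` calls. This is a minimal sketch, not a tested solution for the original file: the sample data and the names monthly_totals, monthly_means and MeanRunoff are assumptions made for illustration.

```r
library(data.table)

# Hypothetical sample data: two keys, daily rows across two Januaries
dt <- data.table(
  Key    = rep(c("A", "B"), each = 4),
  Date   = as.Date(rep(c("1980-01-01", "1980-01-02",
                         "1981-01-01", "1981-01-02"), times = 2)),
  Runoff = c(1, 2, 3, 4, 10, 20, 30, 40)
)

# Derive the grouping columns by reference
dt[, `:=`(Year.Month = format(Date, "%Y-%m"),
          Month      = format(Date, "%m"))]

# Step 1: total runoff per Key and calendar month
monthly_totals <- dt[, .(Runoff = sum(Runoff)), by = .(Key, Year.Month, Month)]

# Step 2: mean of those totals per Key and month of year
monthly_means <- monthly_totals[, .(MeanRunoff = mean(Runoff)), by = .(Key, Month)]

print(monthly_means)
```

Naming the result column (MeanRunoff here) also avoids having to refer to a column called mean(Runoff) later on.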

You can then reshape the output to match your desired output using e.g. reshape2. Suppose you stored the output of the above operation in a data.frame df2, then you could do:

require(reshape2)

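# One row per Key, one column per Month; each cell holds a single value, so sum() returns it unchanged
df2 <- dcast(df2, Key ~ Month, sum, value.var = "mean(Runoff)")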
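
If you would rather stay within data.table, it ships its own dcast(), so the reshape might look like the sketch below. The long-form table and its values are hypothetical, as are the names monthly_means, MeanRunoff and wide.

```r
library(data.table)

# Hypothetical long-form summary (values invented for illustration)
monthly_means <- data.table(
  Key        = c("A", "A", "B", "B"),
  Month      = c("01", "02", "01", "02"),
  MeanRunoff = c(5, 6, 50, 60)
)

# One row per Key, one column per month
wide <- dcast(monthly_means, Key ~ Month, value.var = "MeanRunoff")
print(wide)
```

Because each Key and Month pair occurs only once, no aggregation function is needed here.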
