Seasonal aggregate of monthly data

Submitted by 痞子三分冷 on 2019-12-06 11:50:47

Here is one possible approach:

melt the data, as you suggested...

...and use colsplit to split the "variable" column into "Mon" and "Year" columns.

library(reshape2)
ddt.m <- melt(df, id = c("x", "y"))
ddt.m <- cbind(ddt.m, colsplit(ddt.m$variable, "\\.", c("Mon", "Year")))

Use factor and levels to get your seasons

(which I've left in the "Mon" column. Oops.)

ddt.m$Mon <- factor(ddt.m$Mon)
levels(ddt.m$Mon) <- list(Winter = month.abb[c(12, 1, 2)],
                          Spring = month.abb[c(3:5)],
                          Summer = month.abb[c(6:8)],
                          Autumn = month.abb[c(9:11)])
head(ddt.m)
#         x        y variable     value    Mon Year
# 1 1214842 991964.4 Jan.2009 -1.332933 Winter 2009
# 2 1220442 991964.4 Jan.2009 -1.345808 Winter 2009
# 3 1226042 991964.4 Jan.2009 -1.314435 Winter 2009
# 4 1231642 991964.4 Jan.2009 -1.236600 Winter 2009
# 5 1237242 991964.4 Jan.2009 -1.261989 Winter 2009
# 6 1242842 991964.4 Jan.2009 -1.306614 Winter 2009

Use dcast to aggregate the data

dfSeasonMean <- dcast(ddt.m, x + y ~ Mon + Year, 
                      value.var="value", fun.aggregate=mean)
head(dfSeasonMean)
#         x        y Winter_2009 Winter_2010 Spring_2009 Spring_2010 Summer_2009
# 1 1214842 991964.4   -1.439480   -1.006512 -0.02509008   0.2823048    1.392440
# 2 1220442 964154.4   -1.457407   -1.039266 -0.04337596   0.2315217    1.422541
# 3 1220442 973424.4   -1.456991   -1.035115 -0.04117584   0.2423561    1.414473
# 4 1220442 982694.4   -1.456479   -1.029627 -0.03799926   0.2544062    1.405813
# 5 1220442 991964.4   -1.456234   -1.027081 -0.03815661   0.2610397    1.400743
# 6 1226042 945614.4   -1.463465   -1.031665 -0.04288670   0.2236609    1.434002
#   Summer_2010 Autumn_2009 Autumn_2010
# 1    1.256840  0.06469363 -0.03823892
# 2    1.263593  0.04521096 -0.04485553
# 3    1.258328  0.04860321 -0.04477636
# 4    1.252779  0.05337575 -0.04729598
# 5    1.247251  0.05742809 -0.05152524
# 6    1.272742  0.04692731 -0.04915314
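One thing to watch with the grouping above: the seasons are assigned by month name only, so Dec.2009 ends up in Winter_2009 together with Jan.2009 and Feb.2009. If you would rather have each winter run from December through February, a possible tweak is to shift December into the following year before aggregating. The sketch below uses base R on a few made-up column names; `season_of`, `season_year` and `season_label` are names chosen for illustration, and labelling a winter by the year in which it ends is an assumption, not something the question specifies.

```r
# Toy column names in the same "Mon.Year" form as the question's data.
cols <- c("Dec.2009", "Jan.2010", "Feb.2010", "Mar.2010")

# Split each name into its month abbreviation and its year.
parts <- strsplit(cols, ".", fixed = TRUE)
mon   <- vapply(parts, `[`, character(1), 1)
year  <- as.integer(vapply(parts, `[`, character(1), 2))

# Lookup from month abbreviation to season (same grouping as above).
season_of <- c(Dec = "Winter", Jan = "Winter", Feb = "Winter",
               Mar = "Spring", Apr = "Spring", May = "Spring",
               Jun = "Summer", Jul = "Summer", Aug = "Summer",
               Sep = "Autumn", Oct = "Autumn", Nov = "Autumn")
season <- unname(season_of[mon])

# Assumption: a winter is labelled by the year in which it ends,
# so December is counted with the following year.
season_year <- ifelse(mon == "Dec", year + 1L, year)

season_label <- paste(season, season_year, sep = "_")
season_label
# "Winter_2010" "Winter_2010" "Winter_2010" "Spring_2010"
```

In the melted data the same idea would mean adjusting the "Year" column for the December rows before "Mon" is recoded into seasons.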

Convert each year/month name to a zoo yearmon object and add 1/12 to push it to the next month. After adding one month the 4 seasons correspond to calendar quarters so convert to a zoo yearqtr object. Then regress the data against the year quarters with no intercept and the coefficients will be the desired means:

library(zoo)

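df0 <- df[-(1:2)]  # drop the x and y columns, keeping only the monthly values
Y <- format(as.yearqtr(as.yearmon(names(df0), "%b.%Y") + 1/12)) # "2009 Q1" "2009 Q1" ...
cbind(df[1:2], t(coef(lm(t(df0) ~ Y + 0))))  # one column of seasonal means per quarter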

giving the following. Note that the seasons are labelled by the calendar quarter in which the season ends:

  x y   Y2009 Q1   Y2009 Q2    Y2009 Q3   Y2009 Q4   Y2010 Q1    Y2010 Q2    Y2010 Q3  Y2010 Q4   Y2011 Q1
1 1 1  0.4844135 -0.1464000  0.51947463  0.1692510  0.1050269 -0.04095933 -0.07437911 0.2082204  2.1726117
2 2 2  0.2565755  0.0118020  0.21742535 -0.2123555 -0.5336322  0.60078430  0.96374641 0.2276805  0.4755095
3 3 3 -0.8280485  0.6968518 -0.04217937  0.2166059  0.1438897  0.15929437 -0.54387973 0.3439283 -0.7099464
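If the regression step looks like magic, here is a small check of the underlying fact: regressing on a grouping variable with no intercept gives one coefficient per group, and each coefficient is that group's mean. This is only a base R sketch on made-up numbers; `values` and `group` are hypothetical stand-ins for one row of the monthly data and its quarter labels.

```r
# Toy data: six values in two hypothetical groups.
values <- c(1, 2, 3, 10, 20, 30)
group  <- c("Q1", "Q1", "Q1", "Q2", "Q2", "Q2")

# "+ 0" removes the intercept, so lm() fits one coefficient per group.
fit_means <- coef(lm(values ~ group + 0))

# The same group means computed directly, for comparison.
direct_means <- tapply(values, group, mean)

# Compare the two results, ignoring the differing element names.
all.equal(unname(fit_means), as.vector(direct_means))
```

In the answer above, `t(df0)` puts the months in rows, so `lm` fits every x/y location at once and returns a matrix of coefficients with one row per quarter.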

For each period of interest, you could follow an approach like this:

spring_months <- paste0(c("Mar","Apr","May"), ".2009")
spring_mean <- rowMeans(df[,spring_months], na.rm=TRUE)

winter_months <- c("Dec.2009","Jan.2010","Feb.2010")
winter_mean <- rowMeans(df[,winter_months], na.rm=TRUE)

Then you can just take those variables and make a data frame with df$x and df$y:

data.frame(x=df$x, y=df$y, spring_mean = spring_mean, winter_mean = winter_mean)
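With more than a couple of periods, writing each one out by hand gets repetitive. One way to generalise the same idea is to keep the periods in a named list and apply a small helper to each. This is a sketch only: `df_toy` is made-up data in the same wide layout, and `season_mean` and `periods` are hypothetical names introduced for the example.

```r
# Toy data frame in the same wide layout (values are made up).
df_toy <- data.frame(x = 1:2, y = 1:2,
                     Dec.2009 = c(1, 4), Jan.2010 = c(2, 5), Feb.2010 = c(3, NA),
                     Mar.2010 = c(10, 20), Apr.2010 = c(20, 30), May.2010 = c(30, 40))

# Hypothetical helper: row-wise mean over the named month columns.
# drop = FALSE keeps a data frame even if only one month is given.
season_mean <- function(data, months) {
  rowMeans(data[, months, drop = FALSE], na.rm = TRUE)
}

# One entry per period of interest; the names become the output columns.
periods <- list(
  winter_2010 = c("Dec.2009", "Jan.2010", "Feb.2010"),
  spring_2010 = paste0(c("Mar", "Apr", "May"), ".2010")
)

# Compute every seasonal mean and bind the results to the coordinates.
season_means <- data.frame(df_toy[c("x", "y")],
                           lapply(periods, season_mean, data = df_toy))
season_means
```

Because of `na.rm = TRUE`, a row with a missing month is averaged over the months that are present, so the second row's winter mean here uses only December and January.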