Calculating hourly averages from a multi-year timeseries

╄→гoц情女王★ submitted on 2019-11-30 15:37:21

I predict that ddply and the plyr package are going to be your best friends :). I created a 30-year dataset of hourly random wind speeds between 1 and 10 m/s:

begin_date = as.POSIXlt("1990-01-01", tz = "GMT")
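# Roughly 30 years of hourly timestamps (30 * 365 days, so leap days are
# not accounted for and the series ends about a week short of 2020)
dat = data.frame(dt = begin_date + (0:(24*30*365)) * (3600))
dat = within(dat, {
  speed = runif(length(dt), 1, 10)      # random wind speed in m/s
  unique_day = strftime(dt, "%d-%m")    # day-month key, e.g. "01-02" is 1 February
})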
> head(dat)
                   dt unique_day    speed
1 1990-01-01 00:00:00      01-01 7.054124
2 1990-01-01 01:00:00      01-01 2.202591
3 1990-01-01 02:00:00      01-01 4.111633
4 1990-01-01 03:00:00      01-01 2.687808
5 1990-01-01 04:00:00      01-01 8.643168
6 1990-01-01 05:00:00      01-01 5.499421

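To calculate the daily normals (30-year averages, a term widely used in meteorology) over this 30-year period, group by unique_day and take the mean. Because the key is formatted as day-month, the first rows of the result are 1 January, 1 February, 1 March, and so on: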

library(plyr)
res = ddply(dat, .(unique_day), 
            summarise, mean_speed = mean(speed), .progress = "text")
> head(res)
  unique_day mean_speed
1      01-01   5.314061
2      01-02   5.677753
3      01-03   5.395054
4      01-04   5.236488
5      01-05   5.436896
6      01-06   5.544966

This takes just a few seconds on my humble two-core AMD, so I suspect there is no need to compute everything in a single pass through the data. Several of these ddply calls for different aggregations (month, season, etc.) can simply be run separately.
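
As a rough sketch of what "separate aggregations" could look like, the block below computes monthly, seasonal, and hour-of-day means in three independent passes. It uses base R's tapply instead of ddply so that it runs without plyr. The two-year dataset, the name dat_small, and the DJF/MAM/JJA/SON season lookup are assumptions made for illustration; they are not part of the original answer.

```r
# Small hypothetical dataset: two years of hourly random wind speeds (m/s)
set.seed(42)
dt <- seq(as.POSIXct("1990-01-01 00:00:00", tz = "GMT"),
          as.POSIXct("1991-12-31 23:00:00", tz = "GMT"),
          by = "hour")
dat_small <- data.frame(dt = dt, speed = runif(length(dt), 1, 10))

# Grouping keys derived from the timestamp
dat_small$month <- format(dat_small$dt, "%m")   # "01" .. "12"
dat_small$hour  <- format(dat_small$dt, "%H")   # "00" .. "23"

# Assumed meteorological seasons, indexed by month number
season_lookup <- c("DJF", "DJF", "MAM", "MAM", "MAM", "JJA",
                   "JJA", "JJA", "SON", "SON", "SON", "DJF")
dat_small$season <- season_lookup[as.integer(dat_small$month)]

# One separate aggregation per grouping
monthly_mean  <- tapply(dat_small$speed, dat_small$month,  mean)
seasonal_mean <- tapply(dat_small$speed, dat_small$season, mean)
hourly_mean   <- tapply(dat_small$speed, dat_small$hour,   mean)
```

Each result is a named vector, so for example hourly_mean["13"] gives the mean speed for the 13:00 hour across all days. The same grouping columns could equally be passed to ddply in the style shown above.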

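You can use substr to extract the part of the date you want, and then use tapply or ddply to aggregate the data. For timestamps stored as text in the form "YYYY-MM-DD HH:MM:SS", characters 6 to 19 are the month, day, and time without the year.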

tapply(
  data.multipleyears$Windspeed, 
  substr( data.multipleyears$DATETIME, 6, 19), 
  mean 
)
# 01-01 01:00:00 02-29 12:00:00 05-03 09:00:00 
#              9              3              5 

library(plyr)
ddply(
  data.multipleyears, 
  .(when=substr(DATETIME, 6, 19)), 
  summarize, 
  Windspeed=mean(Windspeed)
)
#             when Windspeed
# 1 01-01 01:00:00         9
# 2 02-29 12:00:00         3
# 3 05-03 09:00:00         5
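
To try the substr approach end to end, here is a minimal sketch with a made-up data.multipleyears. The five rows and their values are hypothetical and were chosen only so the arithmetic is easy to follow. It assumes DATETIME is a character column in "YYYY-MM-DD HH:MM:SS" form. The second grouping, on characters 12 to 13, is an added variation that keeps only the hour of day.

```r
# Hypothetical input: DATETIME as text, Windspeed as numeric
data.multipleyears <- data.frame(
  DATETIME  = c("2000-01-01 01:00:00", "2001-01-01 01:00:00",
                "2000-02-29 12:00:00",
                "2000-05-03 09:00:00", "2001-05-03 09:00:00"),
  Windspeed = c(8, 10, 3, 4, 6),
  stringsAsFactors = FALSE
)

# Characters 6..19 drop the year and keep "MM-DD HH:MM:SS"
key_no_year <- substr(data.multipleyears$DATETIME, 6, 19)
mean_by_day_hour <- tapply(data.multipleyears$Windspeed, key_no_year, mean)

# Characters 12..13 keep only the hour of day ("00" .. "23")
key_hour <- substr(data.multipleyears$DATETIME, 12, 13)
mean_by_hour <- tapply(data.multipleyears$Windspeed, key_hour, mean)
```

With these made-up values, the two 1 January 01:00 readings (8 and 10) average to 9, and the two 3 May 09:00 readings (4 and 6) average to 5. The single 29 February reading stays at 3, which also shows that leap-day groups are averaged over fewer years than the others.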

This is a pretty old post, but I wanted to add that timeAverage from the openair package can probably also be used. The package manual describes more options for the timeAverage function.
