Calculating hourly averages from a multi-year timeseries

栀梦 2021-01-03 07:47

I have a dataset filled with the average windspeed per hour for multiple years. I would like to create an 'average year', in which for each hour the average windspeed for

3 answers
  •  [愿得一人]
    2021-01-03 08:33

    I predict that ddply from the plyr package is going to be your best friend :). I created a 30-year dataset with random hourly wind speeds between 1 and 10 m/s:

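    begin_date = as.POSIXlt("1990-01-01", tz = "GMT")
    # 30 year dataset: one timestamp per hour (365-day years, leap days ignored)
    dat = data.frame(dt = begin_date + (0:(24*30*365 - 1)) * 3600)
    dat = within(dat, {
      # random wind speed in m/s, uniform between 1 and 10
      speed = runif(length(dt), 1, 10)
      # day-month key shared by the same calendar day in every year;
      # tz is given explicitly so the key is not computed in the local time zone
      unique_day = strftime(dt, "%d-%m", tz = "GMT")
    })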
    > head(dat)
                       dt unique_day    speed
    1 1990-01-01 00:00:00      01-01 7.054124
    2 1990-01-01 01:00:00      01-01 2.202591
    3 1990-01-01 02:00:00      01-01 4.111633
    4 1990-01-01 03:00:00      01-01 2.687808
    5 1990-01-01 04:00:00      01-01 8.643168
    6 1990-01-01 05:00:00      01-01 5.499421
    

    To calculate the daily normals (30-year averages; the term is widely used in meteorology) over this 30-year period:

    library(plyr)
    res = ddply(dat, .(unique_day), 
                summarise, mean_speed = mean(speed), .progress = "text")
    > head(res)
      unique_day mean_speed
    1      01-01   5.314061
    2      01-02   5.677753
    3      01-03   5.395054
    4      01-04   5.236488
    5      01-05   5.436896
    6      01-06   5.544966
    

    This takes just a few seconds on my humble two-core AMD, so I suspect that optimizing the code to pass through the data only once is not needed. Several of these ddply calls for different aggregations (month, season, etc.) can simply be run separately.
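
    Since the question asks for an average per hour rather than per day, the same grouping idea can be sketched with an hour-of-year key. This is a minimal sketch using only base R (`aggregate`) instead of plyr; the two-year random dataset and the column names `hour_key`, `month` and `mean_speed` are assumptions made for illustration, not part of the original data.

    ```r
    # Sketch under assumptions: two non-leap years (1990, 1991) of random data.
    set.seed(42)
    begin_date <- as.POSIXct("1990-01-01", tz = "GMT")
    n_hours <- 2 * 365 * 24
    dat <- data.frame(dt = begin_date + (0:(n_hours - 1)) * 3600)
    dat$speed <- runif(nrow(dat), 1, 10)

    # One key per hour of the year, e.g. "01-01 00" for 1 January, 00:00.
    # hour_key is a hypothetical column name.
    dat$hour_key <- format(dat$dt, "%m-%d %H", tz = "GMT")

    # Average all years that share the same hour of the year.
    avg_year <- aggregate(speed ~ hour_key, data = dat, FUN = mean)
    names(avg_year)[2] <- "mean_speed"

    # A coarser aggregation (per calendar month) works the same way.
    dat$month <- format(dat$dt, "%m", tz = "GMT")
    monthly <- aggregate(speed ~ month, data = dat, FUN = mean)

    nrow(avg_year)  # 8760 rows: 365 days * 24 hours
    nrow(monthly)   # 12 rows
    ```

    With real data that includes leap years, the hours of 29 February would be averaged over fewer years than the other hours, so it may be worth deciding up front whether to keep or drop that day.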
