How to aggregate hourly values into 24-hour means without a timestamp

Question

I have 'my_data_hourly' with 3 stations (actually more) and their hourly temperature values over one year. This gives me 8760 hourly measurements in one year. Now I want the same structure, but holding the 365 daily (24-hour) means, as 'my_data_daily'.

I tried a for loop, but it didn't work out. I have heard of an aggregate function, but the examples I found rely on a timestamp, which unfortunately I don't have.

my_data_hourly <- structure(c(8.29, 7.96, 8.14, 7.27, 7.37, 7.3, 7.23, 7.53, 
7.98, 10.2, 12.39, 14.34, 14.87, 14.39, 12.54, 11.84, 10.3, 10.62, 
10.65, 10.56, 10.43, 10.35, 9.85, 9.12, 8.95, 8.82, 8.92, 9.33, 
9.44, 9.3, 9.15, 9.37, 9.54, 10.24, 12.13, 12.43, 12.65, 13, 
13.18, 13.58, 13.64, 13.75, 13.85, 13.94, 13.79, 13.84, 13.94, 
14.26, 24.93, 24.64, 23.67, 21.46, 21.33, 20.83, 21.12, 21.1, 
23.75, 25.39, 30.72, 30.71, 30.81, 30.92, 32.61, 32.37, 32.49, 
30.68, 30.23, 30.45, 28.1, 26.9, 25.09, 25.07, 24.59, 24.22, 
23.05, 22.21, 22.07, 21.6, 21.24, 21.22, 21.85, 24.87, 28.85, 
29.42, 30.82, 30.97, 31.32, 30.81, 30.83, 29.9, 30.01, 30.31, 
30, 27.91, 25.78, 25.88, 8.78, 8.47, 8.49, 7.65, 8.63, 9.02, 
9.02, 8.11, 7.63, 9.19, 11.25, 12.24, 13.62, 12.09, 10.6, 11.1, 
10.16, 10.44, 9.58, 10.04, 10.01, 10.23, 9.51, 9.2, 9.34, 9.6, 
9.4, 9.45, 9.36, 9.26, 9.3, 9.46, 9.58, 9.89, 10.6, 11.04, 12.1, 
12.61, 13.12, 13.47, 13.55, 13.51, 13.63, 13.84, 13.93, 14.17, 
13.97, 13.86), .Dim = c(48L, 3L), .Dimnames = list(NULL, c("station1", 
"station2", "station3")))

hourly_measure    Station1          Station2           Station3
[1,]              8.29             24.93              8.78
[2,]              7.96             24.64              8.47
[3,]              8.14             23.67              8.49
[4,]              7.27             21.46              7.65
[5,]              7.37             21.33              8.63
[6,]              7.30             20.83              9.02
[7,]              7.23             21.12              9.02
[8,]              7.53             21.10              8.11
[9,]              7.98             23.75              7.63
[10,]             10.20            25.39              9.19
[11,]             12.39            30.72             11.25
[12,]             14.34            30.71             12.24
[13,]             14.87            30.81             13.62
[14,]             14.39            30.92             12.09
[15,]             12.54            32.61             10.60
[16,]             11.84            32.37             11.10
[17,]             10.30            32.49             10.16
[18,]             10.62            30.68             10.44
[19,]             10.65            30.23              9.58
[20,]             10.56            30.45             10.04
[21,]             10.43            28.10             10.01
[22,]             10.35            26.90             10.23
[23,]              9.85            25.09              9.51
[24,]              9.12            25.07              9.20
[25,]              8.95            24.59              9.34
[26,]              8.82            24.22              9.60
[27,]              8.92            23.05              9.40
[28,]              9.33            22.21              9.45
[29,]              9.44            22.07              9.36
[30,]              9.30            21.60              9.26
[31,]              9.15            21.24              9.30
[32,]              9.37            21.22              9.46
[33,]              9.54            21.85              9.58
[34,]             10.24            24.87              9.89
[35,]             12.13            28.85             10.60
[36,]             12.43            29.42             11.04
[37,]             12.65            30.82             12.10
[38,]             13.00            30.97             12.61
[39,]             13.18            31.32             13.12
[40,]             13.58            30.81             13.47
[41,]             13.64            30.83             13.55
[42,]             13.75            29.90             13.51
[43,]             13.85            30.01             13.63
[44,]             13.94            30.31             13.84
[45,]             13.79            30.00             13.93
[46,]             13.84            27.91             14.17
[47,]             13.94            25.78             13.97
[48,]             14.26            25.88             13.86

So in theory I want the mean of my_data_hourly[1:24,1] in my_data_daily[1,1] and the mean of my_data_hourly[25:48,1] in my_data_daily[2,1].

Answer 1:

These are time series, so it is probably best to use a time series representation for them, which will facilitate plotting and other time series processing.

1) ts Suppose your data is the matrix m shown reproducibly in the Note at the end. Convert it to a ts time series with frequency 24 and then aggregate it as shown. No packages are used.

tt <- ts(m, frequency = 24)
aggregate(tt, 1, mean)

giving:

Time Series:
Start = 1 
End = 2 
Frequency = 1 
  Station1 Station2  Station3
1 10.06333 26.89042  9.794167
2 11.71000 25.40542 11.585000
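
The ts route works because, with frequency = 24, every block of 24 consecutive rows is treated as one time unit (one day). As a rough cross-check, the sketch below compares it with averaging the row blocks by hand. The matrix m_demo is hypothetical demo data (not the data from the question), and it assumes the number of rows is an exact multiple of 24.

```r
# Hypothetical demo data: 48 hourly rows (2 days) for 3 stations.
m_demo <- matrix(as.numeric(seq_len(48 * 3)), nrow = 48,
                 dimnames = list(NULL, c("Station1", "Station2", "Station3")))

# ts route: 24 observations per time unit, aggregated to 1 per unit.
tt_demo <- ts(m_demo, frequency = 24)
daily_ts <- aggregate(tt_demo, nfrequency = 1, FUN = mean)

# Manual route: average each block of 24 consecutive rows.
daily_manual <- rbind(colMeans(m_demo[1:24, ]),
                      colMeans(m_demo[25:48, ]))

# TRUE if both routes produce the same daily means.
same_result <- isTRUE(all.equal(as.numeric(daily_ts),
                                as.numeric(daily_manual)))
print(same_result)
print(daily_manual)
```

If the row count is not a multiple of 24, the trailing partial day needs separate handling before either route is applied.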

2) zooreg An alternative is to create zooreg objects using the zoo package.

library(zoo)

z <- zooreg(m, frequency = 24)
aggregate(z, as.integer, mean)

giving:

  Station1 Station2  Station3
1 10.06333 26.89042  9.794167
2 11.71000 25.40542 11.585000
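
Why as.integer works as the grouping function here: a regular series with frequency 24 that starts at 1 has index values 1, 1 + 1/24, 1 + 2/24, and so on, so truncating the index yields the day number. The base R sketch below imitates that idea with a hand-built index. It is an illustration only, not zoo's internal code, and m_demo and idx are hypothetical names.

```r
# Hypothetical demo data: 48 hourly rows (2 days) for 3 stations.
m_demo <- matrix(as.numeric(seq_len(48 * 3)), nrow = 48,
                 dimnames = list(NULL, c("Station1", "Station2", "Station3")))

# Hand-built regular index: starts at 1 and advances by 1/24 per row.
idx <- 1 + (seq_len(nrow(m_demo)) - 1) / 24

# Truncating the index gives the day each row belongs to.
day <- as.integer(idx)
hours_per_day <- table(day)

# Mean per day, column by column.
daily <- apply(m_demo, 2, function(col) tapply(col, day, mean))
print(hours_per_day)
print(daily)
```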

Note

Lines <- "
Station1          Station2           Station3
8.29             24.93              8.78
7.96             24.64              8.47
8.14             23.67              8.49
7.27             21.46              7.65
7.37             21.33              8.63
7.30             20.83              9.02
7.23             21.12              9.02
7.53             21.10              8.11
7.98             23.75              7.63
10.20            25.39              9.19
12.39            30.72             11.25
14.34            30.71             12.24
14.87            30.81             13.62
14.39            30.92             12.09
12.54            32.61             10.60
11.84            32.37             11.10
10.30            32.49             10.16
10.62            30.68             10.44
10.65            30.23              9.58
10.56            30.45             10.04
10.43            28.10             10.01
10.35            26.90             10.23
 9.85            25.09              9.51
 9.12            25.07              9.20
 8.95            24.59              9.34
 8.82            24.22              9.60
 8.92            23.05              9.40
 9.33            22.21              9.45
 9.44            22.07              9.36
 9.30            21.60              9.26
 9.15            21.24              9.30
 9.37            21.22              9.46
 9.54            21.85              9.58
10.24            24.87              9.89
12.13            28.85             10.60
12.43            29.42             11.04
12.65            30.82             12.10
13.00             0.97             12.61
13.18            31.32             13.12
13.58            30.81             13.47
13.64            30.83             13.55
13.75            29.90             13.51
13.85            30.01             13.63
13.94            30.31             13.84
13.79            30.00             13.93
13.84            27.91             14.17
13.94            25.78             13.97
14.26            25.88             13.86"
m <- as.matrix(read.table(text = Lines, header = TRUE))

Answer 2:

One dplyr possibility could be:

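library(dplyr)

# df: the hourly data as a data frame (its construction is not shown here)
df %>%
 group_by(Period = gl(n()/24, 24)) %>%  # one factor level per block of 24 rows
 summarise_at(-1, mean)                 # mean of the selected columns (position 1 excluded)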

  Period Station1 Station2 Station3
  <fct>     <dbl>    <dbl>    <dbl>
1 1          10.1     26.9     9.79
2 2          11.7     25.4    11.6
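
For the full-year case described in the question (8760 rows), the same fixed-size grouping can also be sketched without packages by reshaping the matrix into an hours x days x stations array and averaging over the first dimension. The data below are random placeholders, and the names hourly, cube, and daily are hypothetical. The sketch assumes complete days only.

```r
# Hypothetical full-year data: 8760 hourly rows for 3 stations (random values).
set.seed(1)
n_days <- 365
hourly <- matrix(rnorm(24 * n_days * 3, mean = 10, sd = 5), ncol = 3,
                 dimnames = list(NULL, c("Station1", "Station2", "Station3")))

# This reshaping only makes sense when every day is complete.
stopifnot(nrow(hourly) %% 24 == 0)

# Column-major storage means element [h, d, s] is hour h of day d at station s.
cube <- array(hourly, dim = c(24, nrow(hourly) / 24, ncol(hourly)))

# colMeans() averages over the first dimension (the 24 hours),
# leaving a days x stations matrix.
daily <- colMeans(cube)
colnames(daily) <- colnames(hourly)
print(dim(daily))
print(head(daily, 3))
```

Row d of daily then holds the means of rows 24 * (d - 1) + 1 through 24 * d of hourly, which matches the layout asked for in the question.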

Source: https://stackoverflow.com/questions/56476532/how-to-aggregate-hourly-values-into-24h-average-means-without-timestamp

Tags

aggregate

mean