Question
I would like to calculate the mean of a column in a data.frame, grouped by two variables. An extract of the data.frame is shown below:
Station Time Year Month Value
ARO 199501 1995 1 69
ARO 199502 1995 2 87
ARO 199503 1995 3 107
ARO 199601 1996 1 35
ARO 199602 1996 2 46
ARO 199603 1996 3 50
ANT 200401 2004 1 87
ANT 200402 2004 2 115
ANT 200403 2004 3 110
ANT 200501 2005 1 80
ANT 200502 2005 2 122
ANT 200503 2005 3 107
To be more detailed: I would like to calculate the mean value for each Station and Month, so e.g. Mean for ARO in Month 1 = (69+35)/2, Mean for ANT in Month 1 = (87+80)/2
The year doesn't matter since I would like to have the mean for a period of 20 years for every month and station.
My data frame is large: 61 stations, each with 12 months over a time series of 20 years.
I tried several things, such as split, aggregate, and ddply, but none of them worked.
In the end I would like to have a new data frame like the following:
Station Month Valuemean
ARO 1 52
ARO 2 66.5
ARO 3 78.5
ANT 1 83.5
ANT 2 118.5
ANT 3 108.5
Would be great if you have some ideas to realize it. Thanks a lot!
PS: I'm an R beginner ;)
Answer 1:
Assuming your data is named df, you can try aggregate:
aggregate(Value~Month+Station, data=df, FUN = mean)
Month Station Value
1 1 ANT 83.5
2 2 ANT 118.5
3 3 ANT 108.5
4 1 ARO 52.0
5 2 ARO 66.5
6 3 ARO 78.5
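For a reproducible check, here is a minimal sketch that rebuilds the sample rows from the question and reshapes the aggregate result into the layout the question asks for. The column name Valuemean comes from the desired output in the question; the Time column is left out because it is not needed for the calculation, and the sorting step is an optional extra.

```r
# Sample rows copied from the question; the name df follows this answer.
df <- data.frame(
  Station = rep(c("ARO", "ANT"), each = 6),
  Year    = rep(c(1995, 1996, 2004, 2005), each = 3),
  Month   = rep(1:3, times = 4),
  Value   = c(69, 87, 107, 35, 46, 50, 87, 115, 110, 80, 122, 107)
)

# Mean of Value for every Station/Month combination; Year is ignored.
res <- aggregate(Value ~ Station + Month, data = df, FUN = mean)

# Rename the result column and sort by Station, then Month.
names(res)[names(res) == "Value"] <- "Valuemean"
res <- res[order(res$Station, res$Month), ]
print(res)
```

Note that aggregate with a formula drops rows containing NA in the variables used, which is its default na.action.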
Answer 2:
You can use the data.table package:
library(data.table)
setDT(df)[,mean(Value), by=list(Month, Station)]
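With an unnamed expression like the one above, data.table calls the result column V1. If you want the column named as in the question, a small variation should work. This is only a sketch on a subset of the question's data (month 1), and Valuemean is just the name taken from the desired output.

```r
library(data.table)

# Subset of the question's data (month 1 only), for illustration.
df <- data.frame(
  Station = c("ARO", "ARO", "ANT", "ANT"),
  Month   = c(1, 1, 1, 1),
  Value   = c(69, 35, 87, 80)
)

# setDT() converts df to a data.table by reference; .() is an alias for list().
# Naming the expression inside .() names the result column.
res <- setDT(df)[, .(Valuemean = mean(Value)), by = .(Station, Month)]
print(res)
```

Groups appear in order of first occurrence in the data, so ARO comes before ANT here.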
Answer 3:
Using the dplyr package, if your data.frame is called dat:
library(dplyr)
means <- dat %>%
  group_by(Station, Month) %>%
  summarise(Valuemean = mean(Value, na.rm = TRUE))
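To see what na.rm = TRUE does here, the following sketch runs the same pipeline on a few rows from the question with one Value replaced by NA. That missing value is hypothetical and only there for illustration. The .groups = "drop" argument is an optional addition that assumes dplyr 1.0.0 or later; it returns an ungrouped result instead of one still grouped by Station.

```r
library(dplyr)

# Month 1 rows from the question, with one Value set to NA (hypothetical).
dat <- data.frame(
  Station = c("ARO", "ARO", "ANT", "ANT"),
  Month   = c(1, 1, 1, 1),
  Value   = c(69, NA, 87, 80)
)

means <- dat %>%
  group_by(Station, Month) %>%
  # na.rm = TRUE skips the NA, so the ARO mean uses the remaining value only.
  summarise(Valuemean = mean(Value, na.rm = TRUE), .groups = "drop")
print(means)
```

Without na.rm = TRUE, the mean for any group containing an NA would itself be NA.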
Source: https://stackoverflow.com/questions/29895802/calculating-mean-by-selecting-from-two-columns-in-a-data-frame