问题
I have a data.frame from this code:
my_df = data.frame("read_time" = c("2010-02-15", "2010-02-15",
"2010-02-16", "2010-02-16",
"2010-02-16", "2010-02-17"),
"OD" = c(0.1, 0.2, 0.1, 0.2, 0.4, 0.5) )
which produces this:
> my_df
read_time OD
1 2010-02-15 0.1
2 2010-02-15 0.2
3 2010-02-16 0.1
4 2010-02-16 0.2
5 2010-02-16 0.4
6 2010-02-17 0.5
I want to average the OD column over each distinct read_time (notice some are replicated others are not) and I also would like to calculate the standard deviation, producing a table like this:
> my_df
read_time OD stdev
1 2010-02-15 0.15 0.05
5 2010-02-16 0.3 0.1
6 2010-02-17 0.5 0
Which are the best functions to deal with concatenating such values in a data.frame?
回答1:
The plyr package is popular for this, but the base functions by()
and aggregate()
will also help.
> ddply(my_df, "read_time", function(X) data.frame(OD=mean(X$OD),stdev=sd(X$OD)))
read_time OD stdev
1 2010-02-15 0.15000 0.07071
2 2010-02-16 0.23333 0.15275
3 2010-02-17 0.50000 NA
You can add the missing bit to return 0 instead of NA for the last std.dev.
Also, you don't need the quotes (on the variables) you had in the data.frame construction.
回答2:
You can try the package data.table. If you know MySQL it should be very easy for you to get all the functions, otherwise the basics are good enough too ;-)
my_dfdt<-data.table(my_df)
mean<-my_dfdt[,mean(OD), by="read_time"]
sd<- ..
you can also join both in one line or to cbind at the end, your call of style
Another advantage: it is extremely fast, if you have large samples. Very fast...see documentation why.
回答3:
This illustrates how you could use aggregate
to get the mean and standard deviation by your read_time
.
>aggregate(my_df$OD, by=list(my_df$read_time), function(x) mean(x))
Group.1 x
1 2010-02-15 0.1500000
2 2010-02-16 0.2333333
3 2010-02-17 0.5000000
>aggregate(my_df$OD, by=list(my_df$read_time), function(x) sd(x))
Group.1 x
1 2010-02-15 0.07071068
2 2010-02-16 0.15275252
3 2010-02-17 NA
来源:https://stackoverflow.com/questions/2473659/r-what-are-the-best-functions-to-deal-with-concatenating-and-averaging-values-i