`tapply()` to return data frame

Submitted by 陌路散爱 on 2019-12-21 21:27:17

Question


I have a dataset with a datetime (POSIXct) column, a "node" (factor) column and a "c" (numeric) column, for example:

                 date node           c
1 2011-08-14 10:30:00    2 0.051236000
2 2011-08-14 10:30:00    2 0.081230000
3 2011-08-14 10:31:00    1 0.000000000
4 2011-08-14 10:31:00    4 0.001356337
5 2011-08-14 10:31:00    3 0.001356337
6 2011-08-14 10:32:00    2 0.000000000

I need to take the mean of column "c" for all pairs of "date" and "node", so I did this:

tapply(data$c, list(data$node, data$date), mean)

The result I obtain is what I want, but in a strange structure:

num [1:5, 1:8923] 0 0 0.00092 0.00146 NA ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:5] "1" "2" "3" "4" ...
  ..$ : chr [1:8923] "2011-08-14 10:30:00" "2011-08-14 10:31:00" "2011-08-14 10:32:00" "2011-08-14 10:33:00" ...

Where an example output would be:

  2011-08-17 23:56:00 2011-08-17 23:57:00 2011-08-17 23:58:00
1        4.759077e-05        4.759077e-05        4.759077e-05
2        0.000000e+00        3.875248e-05        1.595690e-04
3        1.134391e-03        1.134391e-03        1.109730e-03
4        4.882813e-04        6.914658e-04        4.955846e-04
5        0.000000e+00        0.000000e+00        0.000000e+00

What I was going for was something like the original structure, with a datetime, the node factor and the "c" value. I cannot figure out how to achieve this. Any help would be appreciated.

Many thanks.


Answer 1:


You might try...

aggregate( c ~ node + date, data = data, FUN = mean )
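To see what this returns, here is a minimal sketch that rebuilds the six sample rows from the question and runs the same call. The `tz = "UTC"` setting and the result name `res` are assumptions made for illustration; the question does not state a time zone.

```r
# Sample rows copied from the question. tz = "UTC" is an assumption.
data <- data.frame(
  date = as.POSIXct(c("2011-08-14 10:30:00", "2011-08-14 10:30:00",
                      "2011-08-14 10:31:00", "2011-08-14 10:31:00",
                      "2011-08-14 10:31:00", "2011-08-14 10:32:00"),
                    tz = "UTC"),
  node = factor(c(2, 2, 1, 4, 3, 2)),
  c    = c(0.051236, 0.081230, 0, 0.001356337, 0.001356337, 0)
)

# One row per (node, date) pair that actually occurs in the data;
# the result is a data frame with columns node, date and c.
res <- aggregate(c ~ node + date, data = data, FUN = mean)
str(res)
```

Unlike the `tapply()` matrix, combinations that never occur do not appear as `NA` rows here.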



Answer 2:


If you want output that's a data frame with three columns, you probably would benefit from looking at the plyr package (assuming your data are stored in dat):

library(plyr)
ddply(dat,.(date,node),summarise,m = mean(c))



Answer 3:


Instead of tapply you may want to use ave, which returns one value per input row:

data$grp.mean <- ave(data$c, list(data$node, data$date), FUN= mean)
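Because `ave()` keeps every input row, the group mean is repeated for rows in the same group. If one row per pair is wanted, the duplicates can be dropped afterwards. The sketch below uses three of the question's rows; `collapsed` is a hypothetical name and `tz = "UTC"` is an assumption.

```r
# Three sample rows from the question. tz = "UTC" is an assumption.
data <- data.frame(
  date = as.POSIXct(c("2011-08-14 10:30:00", "2011-08-14 10:30:00",
                      "2011-08-14 10:31:00"), tz = "UTC"),
  node = factor(c(2, 2, 1)),
  c    = c(0.051236, 0.081230, 0)
)

# ave() returns a vector as long as data$c: rows 1 and 2 share a group mean.
data$grp.mean <- ave(data$c, data$node, data$date, FUN = mean)

# Keep one row per (date, node) pair by dropping repeated rows.
collapsed <- unique(data[, c("date", "node", "grp.mean")])
print(collapsed)
```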

Looking again at this I am wondering if you wanted to have the aggregation done on the basis of "date" in the calendar sense of 24 hours?
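If calendar-day grouping is indeed what is wanted, one possible sketch is to derive a day column first and aggregate on that. The rows below are made up for illustration (they are not from the question), and `day`, `daily` and `tz = "UTC"` are assumptions.

```r
# Illustrative rows spanning two calendar days (not from the question).
data <- data.frame(
  date = as.POSIXct(c("2011-08-14 10:30:00", "2011-08-14 10:31:00",
                      "2011-08-15 09:00:00"), tz = "UTC"),
  node = factor(c(2, 2, 2)),
  c    = c(0.05, 0.07, 0.01)
)

# Truncate each timestamp to its calendar day, then average per node and day.
data$day <- as.Date(data$date, tz = "UTC")
daily <- aggregate(c ~ node + day, data = data, FUN = mean)
print(daily)
```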

If you wanted to use the results you already have (assuming they are named "M"), you might want to try:

require(reshape2)
newdf <- melt(t(M))


Source: https://stackoverflow.com/questions/7356522/tapply-to-return-data-frame
