How to run tapply() on multiple columns of a data frame in R?

Posted by 末鹿安然 on 2019-12-02 20:26:27

That's because tapply works on vectors and treats df[,2:10] as a single vector rather than as separate columns. On top of that, sum gives you one overall total, not a sum per column. Use aggregate() instead, e.g.:

aggregate(df[,2:10],by=list(df$a), sum)
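As a minimal sketch of the aggregate() call, assuming a small made-up data frame (the column names a, b1, b2 and all values are illustrative and not taken from the question):

```r
# Hypothetical data: grouping column `a` plus two numeric columns
df <- data.frame(a  = c("D", "D", "F", "R", "R"),
                 b1 = c(1, 2, 3, 4, 5),
                 b2 = c(10, 20, 30, 40, 50))

# One row per group, one summed column per input column
res <- aggregate(df[, 2:3], by = list(a = df$a), FUN = sum)
print(res)
```

Naming the list element (a = df$a) labels the grouping column a in the result; with an unnamed list, as in the call above, that column is called Group.1.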

If you want a list returned, you could use by() for that. Make sure to specify colSums instead of sum, as by() passes each group to the function as a data frame subset:

by(df[,2:10],df$a,FUN=colSums)
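A small sketch of what by() returns here, again assuming made-up data (the names a, b1, b2 are illustrative only). Each group's result is a named vector of column sums, and the pieces can be stacked into a matrix if a table is preferred:

```r
# Hypothetical data: grouping column `a` plus two numeric columns
df <- data.frame(a  = c("D", "D", "F", "R", "R"),
                 b1 = c(1, 2, 3, 4, 5),
                 b2 = c(10, 20, 30, 40, 50))

# List-like "by" object with one element per level of `a`
res <- by(df[, 2:3], df$a, FUN = colSums)

# Column sums for group "D" only
d_sums <- res[["D"]]
print(d_sums)

# Stack the per-group vectors into one matrix (rows = groups)
m <- do.call(rbind, res)
print(m)
```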

Another possibility is to combine apply and tapply.

apply(df[,-1], 2, function(x) tapply(x, df$a, sum))

This produces a matrix of the following form:

    b1  ...   b9
D   sD1 ...  sD9
F   sF1 ...  sF9
R   sR1 ...  sR9

You can then use as.data.frame() to get a data frame as output.
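The apply/tapply combination and the conversion to a data frame can be sketched as follows, assuming a made-up data frame (a, b1, b2 and the values are illustrative). Note that the group labels end up in the row names, so the sketch copies them into a regular column, which is an optional extra step:

```r
# Hypothetical data: grouping column `a` plus two numeric columns
df <- data.frame(a  = c("D", "D", "F", "R", "R"),
                 b1 = c(1, 2, 3, 4, 5),
                 b2 = c(10, 20, 30, 40, 50))

# For each numeric column, sum the values within each level of `a`
m <- apply(df[, -1], 2, function(x) tapply(x, df$a, sum))
print(m)

# Convert the matrix to a data frame; group labels are kept as row names
out <- as.data.frame(m)

# Optional: copy the group labels into an ordinary column
out$a <- rownames(out)
print(out)
```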

Here is a way to apply data.table to this problem.

library(data.table)
DT <- data.table(df)
DT[, lapply(.SD, sum), by=a]

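And here is a dplyr approach (across() requires dplyr 1.0.0 or later; funs() has been deprecated since dplyr 0.8.0):

library(dplyr)
df %>% group_by(a) %>% summarise(across(everything(), sum))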