Aggregate R sum

Posted by 瘦欲@ on 2019-12-17 20:26:53

Question


I'm writing my first program in R and, as a newbie, I'm having some trouble. I hope you can help me.

I've got a data frame like this:

> v1<-c(1,1,2,3,3,3,4)
> v2<-c(13,5,15,1,2,7,4)
> v3<-c(0,3,6,13,8,23,5)
> v4<-c(26,25,11,2,8,1,0)
> datos<-data.frame(v1,v2,v3,v4)
> names(datos)<-c("Position","a1","a2","a3")

> datos
  Position a1 a2 a3
1        1 13  0 26
2        1  5  3 25
3        2 15  6 11
4        3  1 13  2
5        3  2  8  8
6        3  7 23  1
7        4  4  5  0

What I need is to sum the data in a1, a2 and a3 (in my real case, a1 to a51), grouped by Position. I'm trying the aggregate() function, but I can only get it to work for means, not for sums, and I don't know why.

Thanks in advance


Answer 1:


This is fairly straightforward with the plyr library.

library("plyr")
ddply(datos, .(Position), colwise(sum))

If you have additional non-numeric columns that shouldn't be summed, you can use

ddply(datos, .(Position), numcolwise(sum))



Answer 2:


You need to tell aggregate() to use sum: for data frames the summary function is whatever you pass as FUN, so pass sum rather than mean. For example:

aggregate(datos[,c("a1","a2","a3")], by=list(datos$Position), "sum")
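For a version you can paste and run, here is a minimal sketch of the same call on the data from the question. The `cols` vector and the name given to the grouping list element are my own additions for illustration; with the real data you would presumably build the names with `paste0("a", 1:51)`, assuming the columns really are named a1 to a51.

```r
# Rebuild the example data from the question.
datos <- data.frame(
  Position = c(1, 1, 2, 3, 3, 3, 4),
  a1 = c(13, 5, 15, 1, 2, 7, 4),
  a2 = c(0, 3, 6, 13, 8, 23, 5),
  a3 = c(26, 25, 11, 2, 8, 1, 0)
)

# Column names built programmatically (illustrative; adjust the range
# to match the real data).
cols <- paste0("a", 1:3)

# Naming the list element labels the grouping column "Position"
# instead of the default "Group.1".
sums <- aggregate(datos[, cols], by = list(Position = datos$Position), FUN = sum)
print(sums)
#   Position a1 a2 a3
# 1        1 18  3 51
# 2        2 15  6 11
# 3        3 10 44 11
# 4        4  4  5  0
```

Passing the function itself (`FUN = sum`) and the string `"sum"` both work here.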



Answer 3:


ag_df <- aggregate(.~Position,data=datos,sum)

should give you a data frame containing the sums of the "a" values for each of the positions. The trick here is that the . on the left-hand side of the formula stands for every column of datos that is not named elsewhere in the formula, i.e. all the "non-grouping" variables.
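To make the meaning of the . concrete, here is a small sketch that compares the dot formula with one that spells out every column using cbind. The variable names `with_dot` and `with_cbind` are just illustrative; the data is the example from the question.

```r
# Rebuild the example data from the question.
datos <- data.frame(
  Position = c(1, 1, 2, 3, 3, 3, 4),
  a1 = c(13, 5, 15, 1, 2, 7, 4),
  a2 = c(0, 3, 6, 13, 8, 23, 5),
  a3 = c(26, 25, 11, 2, 8, 1, 0)
)

# "." picks up every column except the grouping variable...
with_dot <- aggregate(. ~ Position, data = datos, FUN = sum)

# ...so listing those columns explicitly should give the same result.
with_cbind <- aggregate(cbind(a1, a2, a3) ~ Position, data = datos, FUN = sum)

print(with_dot)
print(all.equal(with_dot, with_cbind))
```

With 51 value columns the dot form saves a lot of typing, which is probably what you want in the real case.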

Note that you can get much the same result with:

sumdf <- rowsum(datos,datos$Position,na.rm=TRUE)

Except that includes the sums of the positions as well!
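One way around that, as a rough sketch, is to drop the grouping column from the data before handing it to rowsum. The column selection below is my own choice for the example data; the groups end up as row names rather than as a column.

```r
# Rebuild the example data from the question.
datos <- data.frame(
  Position = c(1, 1, 2, 3, 3, 3, 4),
  a1 = c(13, 5, 15, 1, 2, 7, 4),
  a2 = c(0, 3, 6, 13, 8, 23, 5),
  a3 = c(26, 25, 11, 2, 8, 1, 0)
)

# Sum only the value columns; Position is used for grouping but is
# not itself summed.
sums <- rowsum(datos[, c("a1", "a2", "a3")], group = datos$Position)
print(sums)
#   a1 a2 a3
# 1 18  3 51
# 2 15  6 11
# 3 10 44 11
# 4  4  5  0
```

If you need Position back as a regular column, the row names of the result hold the group values.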

If you DON'T want all non-group columns aggregated, you can use cbind as in:

sumdf1 <- aggregate(cbind(a1,a3)~Position,datos,sum)

That sums only the a1 and a3 columns.



Source: https://stackoverflow.com/questions/7615922/aggregate-r-sum
