R: calculate rank sum automatically

Posted by 感情迁移 on 2019-12-12 16:26:48

Question


Given x <- cbind(c(10,15,20,20,25,30,30,30,35,40,40,40,40,45),rep(c('M','F'),7)), I want to calculate the rank sums of categories M and F automatically, rather than by hand. What I couldn't figure out is how to adjust the ranks when there are ties. Here, #3 and #4 are both 20 and thus share the rank 3.5 (instead of 3 and 4). Likewise, #6 to #8 share the rank 7, and #10 to #13 share the rank 11.5. Without this adjustment, the sums are wrong.

# Wrong: sums the row positions, ignoring ties
sum(which(x[,2]=='F')) # = 56
sum(which(x[,2]=='M')) # = 49

# Right: sums the tie-adjusted ranks
sum(1,3.5,5,7,9,11.5,11.5) # M: = 48.5
sum(2,3.5,7,7,11.5,11.5,14) # F: = 56.5

I've tried table() and duplicated(), but couldn't figure out how to piece things together. Any ideas?

EDIT: My thanks to konvas for suggesting rank(), which works in addition to bgoldst's solution.
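
The tie adjustment described above matches what rank() does by default, since its ties.method argument defaults to "average". A minimal sketch to illustrate this, where the vector names age, sex and r are chosen here for illustration and are not part of the original post:

```r
# Same data as in the question, kept as two typed vectors
age <- c(10,15,20,20,25,30,30,30,35,40,40,40,40,45)
sex <- rep(c("M","F"), 7)

# Tied values receive the mean of the ranks they span
r <- rank(age)  # same as rank(age, ties.method = "average")
r
# 1.0 2.0 3.5 3.5 5.0 7.0 7.0 7.0 9.0 11.5 11.5 11.5 11.5 14.0

# Rank sum per category
sum(r[sex == "M"])  # 48.5
sum(r[sex == "F"])  # 56.5
```

Keeping the ages in a numeric vector, rather than in the character matrix that cbind() produces, means rank() compares numbers and not strings.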


Answer 1:


You can sum() the rank() with aggregate():

x <- data.frame(age=c(10,15,20,20,25,30,30,30,35,40,40,40,40,45), sex=rep(c('M','F'),7))
aggregate(rank(age)~sex, x, sum)
##   sex rank(age)
## 1   F      56.5
## 2   M      48.5



Answer 2:


With dplyr

library(dplyr)
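x <- cbind(c(10,15,20,20,25,30,30,30,35,40,40,40,40,45),rep(c('M','F'),7))
# cbind() yields a character matrix, so convert X1 back to numeric before ranking
data.frame(x, stringsAsFactors=FALSE) %>% mutate(rank=rank(as.numeric(X1))) %>% group_by(X2) %>% summarise(rank_sum=sum(rank))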



Answer 3:


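In base R, you can use ave():

# x is the character matrix from the question, so convert the values to numeric before ranking
setNames(unique(ave(rank(as.numeric(x[,1])), x[,2], FUN=sum)), unique(x[,2]))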
#    M    F 
# 48.5 56.5 
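
Pairing unique() on the sums with unique() on the labels works here because the two group sums differ; if two groups happened to have the same sum, the vectors would no longer line up. As a possible variation (not part of the original answer), tapply() keys each sum by its group directly. The name rank_sums is chosen here for illustration:

```r
# Same matrix as in the question; cbind() stores both columns as character
x <- cbind(c(10,15,20,20,25,30,30,30,35,40,40,40,40,45), rep(c('M','F'),7))

# Rank the numeric values, then sum the ranks within each category
rank_sums <- tapply(rank(as.numeric(x[,1])), x[,2], sum)
rank_sums
#    F    M
# 56.5 48.5
```

The result is a named array, so an individual rank sum can be read off as rank_sums["M"] or rank_sums["F"].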


Source: https://stackoverflow.com/questions/29534780/r-calculate-rank-sum-automatically
