Conditionally calculating the number of values in a column with R

别说谁变了你拦得住时间么 posted on 2019-12-02 01:10:23

My command of R code isn't great, so here's A Rather Ugly Function:

ARUF <- function(x, y) {
  # Label each x: 'I' (x <= 2), 'II' (2 < x <= 3), 'III' (3 < x <= 5), else 'NA'
  group <- ifelse(x <= 2, 'I', ifelse(x <= 3, 'II', ifelse(x <= 5, 'III', 'NA')))
  df1 <- data.frame(x, y, group)
  ys <- min(y, na.rm = TRUE):max(y, na.rm = TRUE)
  groups <- c('I', 'II', 'III')
  Result1 <- c(); Result2 <- c()
  for (i in ys) for (j in groups) {
    xs <- subset(df1, y == i & group == j)$x
    Result1 <- append(Result1, length(unique(xs)))  # number of distinct x values
    Result2 <- append(Result2, mean(xs))            # mean of x, NaN if the cell is empty
  }
  # One row per (y, group) pair, for every integer y from min(y) to max(y)
  print(data.frame(y = rep(ys, each = 3), x = rep(groups, length(ys)),
                   Result1, Result2), row.names = FALSE)
}

With your x and y, ARUF(x,y) prints this data.frame:

y   x Result1  Result2
1   I       2 1.500000
1  II       0      NaN
1 III       1 5.000000
2   I       2 1.250000
2  II       1 3.000000
2 III       1 5.000000
3   I       1 1.000000
3  II       1 3.000000
3 III       0      NaN
4   I       1 2.000000
4  II       0      NaN
4 III       2 4.666667

I went a little out of my way to make ARUF robust to any integer values of y. I couldn't break it by generating y randomly with rbinom, and I believe it should handle any real-valued x, so it should work for any other vectors of the same kind that you might have.
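
For comparison, here is a minimal base-R sketch of the same kind of summary using cut() and tapply(). The vectors x and y below are made-up sample data (the question's own x and y are not reproduced here), and grp, n_distinct and mean_x are illustrative names. Unlike ARUF, this only reports y values that actually occur, and empty cells come back as NA rather than 0 or NaN.

```r
# Hypothetical sample data, not the vectors from the question
x <- c(1, 2, 5, 1, 3, 4, 5, 5)
y <- c(1, 1, 1, 2, 2, 4, 4, 4)

# Bin x into the three groups: I (x <= 2), II (2 < x <= 3), III (3 < x <= 5)
grp <- cut(x, breaks = c(-Inf, 2, 3, 5), labels = c("I", "II", "III"))

# One matrix cell per (y, group) pair; rows are y values, columns are groups
n_distinct <- tapply(x, list(y, grp), function(v) length(unique(v)))
mean_x     <- tapply(x, list(y, grp), mean)

print(n_distinct)
print(mean_x)
```

If you need a long table like the one ARUF prints, as.data.frame(as.table(mean_x)) is one way to reshape the matrix.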

Mike.Gahan
#Load the data.table package
library(data.table)
data <- data.table(x,y)

#Summarize data
data[, list(x = mean(x, na.rm=TRUE)), by = 
       list(y, x.grp = cut(x, c(-Inf,2,3,5,Inf)))][order(y,x.grp)]
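
To see how the breaks passed to cut() line up with the groups, here is a small sketch on made-up values (vals is a hypothetical vector, not the question's data). By default cut() builds right-closed intervals, so 2 lands in the first bin, 3 in the second and 5 in the third, and anything above 5 gets its own fourth bin.

```r
# Hypothetical values chosen to sit on and around the break points
vals <- c(1, 2, 2.5, 3, 5, 7)

# Same breaks as in the data.table call above
bins <- cut(vals, breaks = c(-Inf, 2, 3, 5, Inf))

# Integer position of each value's bin: 1 1 2 2 3 4
bin_index <- as.integer(bins)

print(data.frame(vals, bins, bin_index))
```

This matches the x<=2, 2<x<=3 and 3<x<=5 boundaries used in ARUF, with one extra bin for values above 5.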

If you'd like the results to be NA when NAs are present, then just remove na.rm=TRUE from mean(.):

data[, list(x = mean(x)), by = 
       list(y, x.grp = cut(x, c(-Inf,2,3,5,Inf)))][order(y,x.grp)]
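
As a quick illustration of what na.rm changes, here is a base-R sketch on a made-up vector v (hypothetical, not from the question). It also shows where a NaN comes from: the mean of an empty set of values.

```r
v <- c(1, NA, 3)                      # hypothetical values with one missing entry

with_na    <- mean(v)                 # NA: the missing value propagates
without_na <- mean(v, na.rm = TRUE)   # 2: the NA is dropped before averaging
empty_mean <- mean(numeric(0))        # NaN: nothing to average

print(c(with_na, without_na, empty_mean))
```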