Computing multiple variance of a dataset in R

Submitted by 老子叫甜甜 on 2019-12-11 03:07:33

Question


My problem is somewhat related to this question.

I have data like the following:

V1   V2
..   1
..   2
..   1
..   3

I need to calculate the variance of the data in V1 cumulatively for each value of V2. That is, for a particular value of V2, say n, all rows of V1 whose corresponding V2 is less than n need to be included.

Will ddply help in such a case?


Answer 1:


I don't think ddply will help since it is built on the concept of taking non-overlapping subsets of a data frame.

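# Example data: 1000 uniform values in V1, with V2 drawn from 1..10
d <- data.frame(V1=runif(1000),V2=sample(1:10,size=1000,replace=TRUE))
# Distinct V2 values in increasing order
u <- sort(unique(d$V2))
# For each V2 value x, take the variance of all V1 with V2 <= x
ans <- sapply(u,function(x) {
    with(d,var(V1[V2<=x]))
})
# Label each result with the V2 value it corresponds to
names(ans) <- u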

I don't know if there's a more efficient way to do this ...
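One possibly cheaper alternative, sketched here and not part of the original answer, is to sort by V2 once and derive every cumulative variance from running sums of V1 and V1^2. The helper name `cumulative_var` is hypothetical. The sketch uses the same `V2 <= x` convention as the code above, and it relies on the textbook sum-of-squares formula, which can lose precision when the mean is large relative to the spread.

```r
# Hypothetical helper (not from the original answer): cumulative variance
# of v1 over increasing values of v2, using running sums.
cumulative_var <- function(v1, v2) {
  ord <- order(v2)                 # sort rows by v2
  v1 <- v1[ord]
  v2 <- v2[ord]
  n  <- seq_along(v1)              # running count of rows
  s1 <- cumsum(v1)                 # running sum of v1
  s2 <- cumsum(v1^2)               # running sum of v1 squared
  # Last row of each v2 group: everything up to it has v2 <= that value
  last <- !duplicated(v2, fromLast = TRUE)
  # Sample variance from sums: (sum(x^2) - sum(x)^2 / n) / (n - 1)
  out <- (s2[last] - s1[last]^2 / n[last]) / (n[last] - 1)
  names(out) <- v2[last]
  out
}

# Compare against the sapply approach on the same kind of data
set.seed(1)
d <- data.frame(V1 = runif(1000), V2 = sample(1:10, size = 1000, replace = TRUE))
fast <- cumulative_var(d$V1, d$V2)

u <- sort(unique(d$V2))
slow <- sapply(u, function(x) with(d, var(V1[V2 <= x])))
names(slow) <- u

all.equal(fast, slow)
```

The sapply version rescans the data once per distinct V2 value, while this sketch does one sort and one pass. If the smallest V2 value occurs in only one row, this version yields NaN for that entry where `var` would return NA.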



Source: https://stackoverflow.com/questions/12446789/computing-multiple-variance-of-a-dataset-in-r
