Efficiently perform row-wise distribution test


陌清茗 2021-01-05 02:43

I have a matrix in which each row is a sample from a distribution. I want to do a rolling comparison of the distributions using ks.test and save the test statis

4 answers

醉酒成梦 (original poster)

2021-01-05 03:15
I was able to compute the Kolmogorov-Smirnov statistic for each pair of adjacent rows using ks.test() with rollapplyr() from the zoo package.
```
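library(zoo)  # provides rollapplyr()

# Slide a 2-row window down the matrix and compare the two rows in each window
results <- rollapplyr(data = big,
                      width = 2,
                      FUN = function(x) ks.test(x[1, ], x[2, ])$statistic,
                      by.column = FALSE)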
```
This gives the expected result, but it is very slow for a dataset of your size. This may be because ks.test() computes much more than the statistic at each iteration: it also computes the p-value and does a lot of error checking.

Indeed, if we simulate a large dataset like so:
```
# 300,000 rows (one sample per row) by 400 columns of standard normal draws
big <- matrix(rnorm(300000 * 400), nrow = 300000, ncol = 400)
```
The rollapplyr() solution takes a long time; I halted execution after about 2 hours, at which point it had computed nearly all (but not all) results.

So while rollapplyr() is likely faster than a for loop, it is unlikely to be the best-performing solution overall.
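
If the extra work inside ks.test() really is the bottleneck, one option is to compute only the two-sample statistic yourself. The following is a minimal sketch, not benchmarked on the full dataset. The names ks_stat, small, and stats are made up for illustration, and the function assumes the rows contain no missing values. It takes the maximum absolute difference between the two empirical CDFs, evaluated over the pooled, sorted values.

```r
# Hypothetical helper: two-sample KS statistic only (no p-value, no input checks).
# Assumes x and y are numeric vectors without NA values.
ks_stat <- function(x, y) {
  n_x <- length(x)
  n_y <- length(y)
  w <- c(x, y)
  ord <- order(w)
  # Walk through the pooled values in sorted order: step up by 1/n_x for a
  # value from x, step down by 1/n_y for a value from y. The running sum is
  # the difference between the two empirical CDFs.
  z <- cumsum(ifelse(ord <= n_x, 1 / n_x, -1 / n_y))
  # With tied values, only the position after the last member of each tie
  # is a valid evaluation point.
  keep <- c(which(diff(w[ord]) != 0), n_x + n_y)
  max(abs(z[keep]))
}

# Small illustrative matrix (the real data would be `big`).
set.seed(1)
small <- matrix(rnorm(50 * 20), nrow = 50)

# Compare each row with the next one; vapply preallocates the result vector.
stats <- vapply(seq_len(nrow(small) - 1),
                function(i) ks_stat(small[i, ], small[i + 1, ]),
                numeric(1))
```

On a small matrix the output can be checked against ks.test(...)$statistic for the same row pairs before trying it on the full data. Whether it is fast enough for 300,000 rows would need to be timed.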