Efficiently perform row-wise distribution test

陌清茗 2021-01-05 02:43

I have a matrix in which each row is a sample from a distribution. I want to do a rolling comparison of the distributions using ks.test and save the test statistic for each pair of consecutive rows.

4 Answers
  •  Happy的楠姐
    2021-01-05 03:12

    A quick and dirty implementation in Rcpp:

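    // [[Rcpp::depends(RcppArmadillo)]]
    #include <RcppArmadillo.h>

    // Two-sample KS statistic; assumes x and y have the same length and no ties.
    double KS(arma::colvec x, arma::colvec y) {
      int n = x.n_rows;
      arma::colvec w = join_cols(x, y);
      arma::uvec z = arma::sort_index(w);
      // Walk the pooled sample in sorted order: +1 for a value from x, -1 for y.
      w.fill(-1); w.elem( find(z <= n-1) ).ones();
      return max(abs(cumsum(w)))/n;
    }
    // [[Rcpp::export]]
    Rcpp::NumericVector K_S(arma::mat mt) {
      int n = mt.n_cols;
      Rcpp::NumericVector results(n);
      // Compare each column with the previous one; results[0] stays 0.
      for (int i=1; i<n; i++) {
        arma::colvec x = mt.col(i-1);
        arma::colvec y = mt.col(i);
        results[i] = KS(x, y);
      }
      return results;
    }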

    For a matrix of size (400, 30000), it completes in under 1 s.

    system.time(K_S(t(mt)))[3]
    #elapsed 
    #   0.98 
    

    The result also appears to match the output of ks.test:

    set.seed(1942)
    mt <- matrix(rnorm(400*30000), nrow=30000)
    results <- rep(0, nrow(mt))
    for (i in 2 : nrow(mt)) {
      results[i] <- ks.test(x = mt[i - 1, ], y = mt[i, ])$statistic
    }
    result <- K_S(t(mt))
    all.equal(result, results)
    #[1] TRUE
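
The Rcpp helper above treats both samples as having the same length and does not handle tied values. As a rough sketch of the same statistic without those restrictions, the plain C++ below (standard library only, no Armadillo) sorts each sample and steps through both empirical CDFs together. The names `ks_statistic` and `rolling_ks` are made up for this illustration and are not part of the answer's code or of any library.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Two-sample KS statistic D = sup_t |F_x(t) - F_y(t)|.
// The arguments are taken by value because both samples are sorted in place.
double ks_statistic(std::vector<double> x, std::vector<double> y) {
  std::sort(x.begin(), x.end());
  std::sort(y.begin(), y.end());
  const double nx = static_cast<double>(x.size());
  const double ny = static_cast<double>(y.size());
  std::size_t i = 0, j = 0;
  double d = 0.0;
  while (i < x.size() && j < y.size()) {
    // Step past every observation equal to the current smallest value,
    // so tied values advance both empirical CDFs before they are compared.
    const double t = std::min(x[i], y[j]);
    while (i < x.size() && x[i] == t) ++i;
    while (j < y.size() && y[j] == t) ++j;
    d = std::max(d, std::fabs(i / nx - j / ny));
  }
  // Once one sample is exhausted the gap can only shrink, so d is final.
  return d;
}

// Statistic for each pair of consecutive rows. out[0] stays 0,
// mirroring the layout of the `results` vector in the R check above.
std::vector<double> rolling_ks(const std::vector<std::vector<double>>& rows) {
  std::vector<double> out(rows.size(), 0.0);
  for (std::size_t r = 1; r < rows.size(); ++r) {
    out[r] = ks_statistic(rows[r - 1], rows[r]);
  }
  return out;
}
```

Calling `rolling_ks` on a vector of rows returns one value per row, with element `r` holding the statistic for rows `r - 1` and `r`. Each call sorts two copies of the rows, so this version favours clarity over speed and has not been benchmarked against the Armadillo code.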
    
