Ordering Permutation in Rcpp i.e. base::order()

不知归路 2020-12-14 05:26

I have a ton of code using the base::order() command and I am really too lazy to code around that in Rcpp. Since Rcpp only supports sort, but not order, I spent 2 m

3 Answers
  •  隐瞒了意图╮
    2020-12-14 05:32

    Another solution, based on C++11:

    // [[Rcpp::plugins(cpp11)]]
    #include <Rcpp.h>
    #include <algorithm>  // std::stable_sort, std::rotate
    #include <numeric>    // std::iota
    using namespace Rcpp;
    
    // Returns the 1-based permutation that sorts x; ties keep their input order.
    template <int RTYPE>
    IntegerVector order_impl(const Vector<RTYPE>& x, bool desc) {
        auto n = x.size();
        IntegerVector idx = no_init(n);
        // Fill idx with 1, 2, ..., n (R indices are 1-based).
        std::iota(idx.begin(), idx.end(), static_cast<size_t>(1));
        if (desc) {
            auto comparator = [&x](size_t a, size_t b){ return x[a - 1] > x[b - 1]; };
            std::stable_sort(idx.begin(), idx.end(), comparator);
        } else {
            auto comparator = [&x](size_t a, size_t b){ return x[a - 1] < x[b - 1]; };
            std::stable_sort(idx.begin(), idx.end(), comparator);
            // simulate na.last: count the leading NAs, then rotate them to the end
            size_t nas = 0;
            for (size_t i = 0; i < n; ++i, ++nas)
                if (!Vector<RTYPE>::is_na(x[idx[i] - 1])) break;
            std::rotate(idx.begin(), idx.begin() + nas, idx.end());
        }
        return idx;
    }
    
    // Dispatch on the runtime type of x to the matching template instance.
    // [[Rcpp::export]]
    IntegerVector order2(SEXP x, bool desc = false) {
        switch(TYPEOF(x)) {
        case INTSXP: return order_impl<INTSXP>(x, desc);
        case REALSXP: return order_impl<REALSXP>(x, desc);
        case STRSXP: return order_impl<STRSXP>(x, desc);
        default: stop("Unsupported type.");
        }
    }
    
    /***R
    int <- sample.int(1000, 1E5, replace = TRUE)
    dbl <- runif(1E5)
    chr <- sample(letters, 1E5, replace = TRUE)
    library(benchr)
    benchmark(order(int), order2(int))
    benchmark(order(dbl), order2(dbl))
    benchmark(order(chr), order2(chr))
    */
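
The core idea of the answer (fill an index vector with 1..n, then stable-sort the indices by the values they point at) can also be tried without R or Rcpp. The following is only a minimal sketch in plain C++11, assuming integer input: the names order_sketch and kNA are hypothetical, and INT_MIN is used as a stand-in for the missing value because that is how R stores NA_integer_.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical stand-in for R's NA_integer_, which R stores as INT_MIN.
const int kNA = INT_MIN;

// Returns the 1-based permutation that sorts x, similar to order(x) in R.
// Ties keep their original relative order because std::stable_sort is used.
std::vector<int> order_sketch(const std::vector<int>& x, bool desc = false) {
    std::vector<int> idx(x.size());
    std::iota(idx.begin(), idx.end(), 1);  // 1, 2, ..., n
    if (desc) {
        std::stable_sort(idx.begin(), idx.end(),
                         [&x](int a, int b) { return x[a - 1] > x[b - 1]; });
    } else {
        std::stable_sort(idx.begin(), idx.end(),
                         [&x](int a, int b) { return x[a - 1] < x[b - 1]; });
        // kNA is the smallest int, so missing values sort to the front;
        // count them and rotate them to the back (na.last behaviour).
        std::size_t nas = 0;
        while (nas < idx.size() && x[idx[nas] - 1] == kNA) ++nas;
        std::rotate(idx.begin(), idx.begin() + nas, idx.end());
    }
    return idx;
}
```

For the input {30, 10, 20, 10} this returns {2, 4, 3, 1}: the two 10s come first in their original order, then 20, then 30. In the descending branch no rotation is needed for this sketch, since the smallest int already ends up last.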
    

    Compare performance:

    R> int <- sample.int(1000, 1E5, replace = TRUE)
    
    R> dbl <- runif(1E5)
    
    R> chr <- sample(letters, 1E5, replace = TRUE)
    
    R> library(benchr)
    
    R> benchmark(order(int), order2(int))
    Benchmark summary:
    Time units : microseconds 
            expr n.eval  min lw.qu median mean up.qu  max  total relative
     order(int)    100  442   452    464  482   486 1530  48200      1.0
    order2(int)    100 5150  5170   5220 5260  5270 6490 526000     11.2
    
    R> benchmark(order(dbl), order2(dbl))
    Benchmark summary:
    Time units : milliseconds 
            expr n.eval   min lw.qu median  mean up.qu  max total relative
     order(dbl)    100 13.90 14.00  14.20 14.80  15.8 17.4  1480     1.98
    order2(dbl)    100  7.11  7.13   7.15  7.26   7.3  8.8   726     1.00
    
    R> benchmark(order(chr), order2(chr))
    Benchmark summary:
    Time units : milliseconds 
            expr n.eval   min lw.qu median  mean up.qu   max total relative
     order(chr)    100 128.0 131.0  133.0 133.0 134.0 148.0 13300     7.34
    order2(chr)    100  17.7  17.9   18.1  18.2  18.3  22.2  1820     1.00
    

    Note that the radix method of base order() is much faster: in the integer benchmark above, order() beats order2() by a factor of about 11.
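
To illustrate why a non-comparison sort can win on integers with a small range (such as the values 1..1000 in the benchmark), here is a rough sketch of an ordering permutation built with a stable counting sort. This is not R's actual radix implementation, only a simplified single-pass relative of it; the name order_counting and the assumption that all values lie in 1..max_value (no NAs) are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: 1-based ordering permutation for values known to lie
// in 1..max_value, computed with a stable counting sort in O(n + max_value).
std::vector<int> order_counting(const std::vector<int>& x, int max_value) {
    // After the prefix sums, start[v] is the output position of the first
    // element equal to v.
    std::vector<std::size_t> start(static_cast<std::size_t>(max_value) + 2, 0);
    for (int v : x) ++start[v + 1];                 // histogram, shifted by one
    for (std::size_t v = 1; v < start.size(); ++v)  // prefix sums
        start[v] += start[v - 1];
    std::vector<int> idx(x.size());
    for (std::size_t i = 0; i < x.size(); ++i)      // input order => stable
        idx[start[x[i]]++] = static_cast<int>(i) + 1;
    return idx;
}
```

The work is linear in the input length plus the value range, with no comparisons at all, whereas the std::stable_sort approach needs on the order of n log n comparisons through an index indirection.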
