More elegant way to return a sequence of numbers based on booleans?

爱一瞬间的悲伤 2020-12-30 09:28

Here's a sample of booleans I have as part of a data.frame:

atest <- c(FALSE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, FALSE, TRUE, TRUE,

3 answers
  •  情歌与酒
    2020-12-30 09:51

    Problems like these tend to work well with Rcpp. Borrowing @flodel's code as a framework for benchmarking:

    boolseq.cpp
    -----------
    
    #include <Rcpp.h>
    using namespace Rcpp;
    
    // [[Rcpp::export]]
    IntegerVector boolSeq(LogicalVector x) {
      int n = x.length();
      IntegerVector output = no_init(n);
      int counter = 1;
      for (int i=0; i < n; ++i) {
        if (!x[i]) {
          counter = 1;
        }
        output[i] = counter;
        ++counter;
      }
      return output;
    }
    
    /*** R
    x <- c(FALSE, sample( c(FALSE, TRUE), 1E5, TRUE ))
    
    f0 <- function(x) sequence(tabulate(cumsum(!x)))
    f1 <- function(x) {i <- seq_along(x); i - cummax(i * !x) + 1L}
    
    library(microbenchmark)
    microbenchmark(f0(x), f1(x), boolSeq(x), times=100)
    
    stopifnot(identical(f0(x), f1(x)))
    stopifnot(identical(f1(x), boolSeq(x)))
    */
    

    Running it through sourceCpp() gives me:

    Unit: microseconds
           expr       min        lq     median         uq       max neval
          f0(x) 18174.348 22163.383 24109.5820 29668.1150 78144.411   100
          f1(x)  1498.871  1603.552  2251.3610  2392.1670  2682.078   100
     boolSeq(x)   388.288   426.034   518.2875   571.4235   699.710   100
    

    Less elegant, but pretty darn close to what you were writing with R code.
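
To try the counter logic outside R, here is a minimal standalone C++ sketch without Rcpp. The function names runPositions and runPositionsViaLastFalse are made up for illustration and are not part of the answer's code. The first function mirrors the loop in boolSeq. The second is one way to read what f1 computes: cummax(i * !x) is the 1-based index of the most recent FALSE seen so far, and subtracting it from the current index gives the position within the run. As far as I can tell, the two agree only when the input starts with FALSE, which may be why the benchmark prepends one to x.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical standalone analogue of boolSeq(): restart the count at 1 on
// every false element, otherwise keep counting from the previous position.
std::vector<int> runPositions(const std::vector<bool>& flags) {
    std::vector<int> out(flags.size());
    int counter = 1;
    for (std::size_t i = 0; i < flags.size(); ++i) {
        if (!flags[i]) {
            counter = 1;  // a false value starts a new run
        }
        out[i] = counter;
        ++counter;
    }
    return out;
}

// Hypothetical rendering of the idea behind f1: keep a running maximum of
// the 1-based indices at which a false occurred, then subtract it from the
// current index and add 1.
std::vector<int> runPositionsViaLastFalse(const std::vector<bool>& flags) {
    std::vector<int> out(flags.size());
    int lastFalse = 0;  // stays 0 until the first false is seen
    for (std::size_t i = 0; i < flags.size(); ++i) {
        const int index = static_cast<int>(i) + 1;  // R-style 1-based index
        lastFalse = std::max(lastFalse, flags[i] ? 0 : index);
        out[i] = index - lastFalse + 1;
    }
    return out;
}
```

For the input {false, true, true, false, true} both functions return {1, 2, 3, 1, 2}. For an input that starts with true, such as {true, true, false}, the loop version returns {1, 2, 1} while the index-based version returns {2, 3, 1}, so the leading FALSE matters when comparing them.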
