R - Faster Way to Calculate Rolling Statistics Over a Variable Interval

不知归路 2020-12-03 02:00

I'm curious if anyone out there can come up with a (faster) way to calculate rolling statistics (rolling mean, median, percentiles, etc.) over a variable interval of time (

4 Answers
  •  无人及你
    2020-12-03 02:46

    Rcpp is a good approach if speed is your primary concern. I'll use the rolling mean statistic to explain by example.

    Benchmarks: Rcpp versus R

    x = sort(runif(25000,0,4*pi))
    y = sin(x) + rnorm(length(x),0.5,0.5)
    system.time( rollmean_r(x,y,xout=x,width=1.1) )   # ~60 seconds
    system.time( rollmean_cpp(x,y,xout=x,width=1.1) ) # ~0.0007 seconds
    

    Code for the Rcpp and R functions

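    library(Rcpp)
    cppFunction('
      NumericVector rollmean_cpp( NumericVector x, NumericVector y, 
                                  NumericVector xout, double width) {
        double total=0;
        unsigned int n=x.size(), nout=xout.size(), i, ledge=0, redge=0;
        NumericVector out(nout);
        for( i=0; i<nout; i++ ) {  // x and xout must be sorted ascending
          while( redge<n && x[redge]-xout[i] <= width ) total += y[redge++];  // grow right edge
          while( ledge<n && xout[i]-x[ledge] > width ) total -= y[ledge++];   // advance left edge
          if( ledge==redge ) { out[i]=R_NaN; total=0; continue; }             // empty window
          out[i] = total / (redge-ledge);
        }
        return out;
      }')
    
    rollmean_r = function(x,y,xout,width) {
      out = numeric(length(xout))
      for( i in seq_along(xout) ) {
        window = x >= (xout[i]-width) & x <= (xout[i]+width)
        out[i] = .Internal(mean( y[window] ))
      }
      return(out)
    }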
    

    Now for an explanation of rollmean_cpp. x and y are the data, and xout is a vector of points at which the rolling statistic is requested. width is the half-width of the rolling window: each window spans xout[i]-width to xout[i]+width, so its full extent is width*2. Both x and xout must be sorted in ascending order. Note that the indices for the ends of the sliding window are stored in ledge and redge. These are essentially pointers to the respective elements in x and y: the current window runs from ledge up to, but not including, redge. These indices could be very beneficial for calling other C++ functions (e.g., median and the like) that take a vector and starting and ending indices as input.
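
To illustrate how the window indices could feed another statistic, here is a minimal sketch of a rolling median. It is not from the original answer: the names rollmedian_sketch and window_median are hypothetical, and it uses std::vector in place of NumericVector so that it compiles outside R. It assumes x and xout are sorted ascending and uses the same inclusive window as rollmean_r.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Median of y[lo, hi). Works on a copy of the window, so y is left untouched.
// Returns NaN for an empty window.
double window_median(const std::vector<double>& y, std::size_t lo, std::size_t hi) {
    if (lo >= hi) return std::numeric_limits<double>::quiet_NaN();
    std::vector<double> w(y.begin() + lo, y.begin() + hi);
    const std::size_t mid = w.size() / 2;
    std::nth_element(w.begin(), w.begin() + mid, w.end());
    const double upper = w[mid];
    if (w.size() % 2 == 1) return upper;
    // Even count: after nth_element, everything left of mid is <= w[mid],
    // so the lower middle value is the maximum of that left part.
    const double lower = *std::max_element(w.begin(), w.begin() + mid);
    return 0.5 * (lower + upper);
}

// Rolling median over [xout[i] - width, xout[i] + width].
// Assumption: x and xout are sorted in ascending order.
std::vector<double> rollmedian_sketch(const std::vector<double>& x,
                                      const std::vector<double>& y,
                                      const std::vector<double>& xout,
                                      double width) {
    assert(x.size() == y.size());
    const std::size_t n = x.size();
    std::size_t ledge = 0, redge = 0;  // window is y[ledge, redge)
    std::vector<double> out(xout.size());
    for (std::size_t i = 0; i < xout.size(); ++i) {
        // Advance the right edge past every point within width of xout[i].
        while (redge < n && x[redge] - xout[i] <= width) ++redge;
        // Advance the left edge past every point more than width below xout[i].
        while (ledge < n && xout[i] - x[ledge] > width) ++ledge;
        out[i] = window_median(y, ledge, redge);
    }
    return out;
}
```

Unlike the running total in the rolling mean, this sketch copies and partially sorts each window, so the cost per output point grows with the window size. The edge bookkeeping is unchanged; only the statistic applied to the window differs.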

    For those who want a "verbose" version of rollmean_cpp for debugging (lengthy):

    cppFunction('
      NumericVector rollmean_cpp( NumericVector x, NumericVector y, 
                                  NumericVector xout, double width) {
    
        double total=0, oldtotal=0;
        unsigned int n=x.size(), nout=xout.size(), i, ledge=0, redge=0;
        NumericVector out(nout);
    
    
        for( i=0; i<nout; i++ ) {
          // ... (the rest of the verbose listing is cut off in the source)
