Can Rcpp replace unif function in R?

99封情书 提交于 2019-12-04 22:22:21

问题


I have just started using the Rcpp package in R, my learning is inspired by the Advanced R course by Hadley Wickham.

Within R studio I have the following .cpp file. The question is more general but this example helps.

#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
NumericVector runifC(int n, double min=0, double max=1) {
  NumericVector out(n);

  for(int i = 0; i < n; ++i) {
    out[i] = min + ((double) rand() / (RAND_MAX)) * (max - min);
  }
  return out;
}

/*** R
library(microbenchmark)
microbenchmark(
  'R unif-1'      = runif(1),
  'C++ unif-1'    = runifC(1),
  'R unif-100'    = runif(100),
  'C++ unif-100'  = runifC(100),
  'R unif-1000'   = runif(1000),
  'C++ unif-1000' = runifC(1000),
  'R unif-100000'   = runif(100000),
  'C++ unif-100000' = runifC(100000)
)
*/

When I source/save the file it shows me the performance output.

Unit: nanoseconds
             expr     min        lq       mean    median        uq     max neval
         R unif-1    2061    2644.5    4000.71    3456.0    4297.0   15402   100
       C++ unif-1     710    1190.0    1815.11    1685.0    2168.5    5776   100
       R unif-100    4717    5566.5    6794.14    6563.0    7435.5   16600   100
     C++ unif-100    1450    1997.5    2663.29    2591.5    3107.0    5307   100
      R unif-1000   28210   29584.5   31310.54   30380.0   31599.0   52879   100
    C++ unif-1000    8292    8951.0   10113.78    9462.5   10121.5   25099   100
    R unif-100000 2642581 2975117.0 3104580.62 3030938.5 3119489.0 5435046   100
  C++ unif-100000  699833  990924.0 1058855.49 1034430.5 1075078.0 1530351   100

I would expect that runif would be a very optimised function but the C++ code runs much more efficiently. I might be naive here, but if there is such a difference in performance then why aren't all applicable R functions rewritten in C++?

It seems so obvious that there are a lot of improvements possible that I feel as if I am missing a huge reason of why not all R functions can be blindly copied to C++ for performance.

edit: for this example it has been shown that the C++ implementation of rand() is slightly flawed. the performance gap that I noticed most used the rand() function. performance of other functions doesn't seem as drastic so i changed the name of the question.


回答1:


Please DO NOT USE rand(). Doing so will kick your package off CRAN too should you submit it.

See eg this C++ reference page for a warning:

Notes

There are no guarantees as to the quality of the random sequence produced. In the past, some implementations of rand() have had serious shortcomings in the randomness, distribution and period of the sequence produced (in one well-known example, the low-order bit simply alternated between 1 and 0 between calls).

If you are interested in alternate random number generators and timing, the Rcpp Gallery.

In general, use the generators provided by R which are of excellent statistical quality, and offered in both scalar and vectorised form ("Rcpp Sugar") by Rcpp.




回答2:


As of R-3.1.1, runif uses the .External interface, which copies its arguments. Luke Tierney changed this to use the .Call interface in R-devel in revision 66110. The .Call interface does not copy its arguments. Rcpp uses the .Call interface.


Your C++ code is still faster under R-devel (using the .Call interface). This is likely because of differences in the random number generator being used. Also, R's functions will generally have more checks than whatever specialized code you write; and those checks take time.



来源:https://stackoverflow.com/questions/26466437/can-rcpp-replace-unif-function-in-r

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!