Running compiled C++ code with Rcpp

清歌不尽 2021-02-06 00:53

I have been working my way through Dirk Eddelbuettel's Rcpp tutorial here:

http://www.rinfinance.com/agenda/

I have learned how to save a C++ file

3 Answers
  •  感动是毒
    2021-02-06 00:56

    Thank you to user1981275, Dirk Eddelbuettel and Romain Francois for their responses. Below is how I compiled a C++ file and created a *.dll, then called and used that *.dll file inside R.

    Step 1. I created a new folder called 'c:\users\mmiller21\myrpackages' and pasted the file 'logabs2.cpp' into that new folder. The file 'logabs2.cpp' was created as described in my original post.

    Step 2. Inside the new folder I created a new R package called 'logabs2' using an R file I wrote called 'new package creation.r'. The contents of 'new package creation.r' are:

    setwd('c:/users/mmiller21/myrpackages/')
    
    library(Rcpp)
    
    Rcpp.package.skeleton("logabs2", example_code = FALSE, cpp_files = c("logabs2.cpp"))
    

    I found the above syntax for Rcpp.package.skeleton on one of Hadley Wickham's websites: https://github.com/hadley/devtools/wiki/Rcpp

    Step 3. I installed the new R package "logabs2" in R using the following line in the DOS command window:

    C:\Program Files\R\R-3.0.1\bin\x64>R CMD INSTALL -l c:\users\mmiller21\documents\r\win-library\3.0\ c:\users\mmiller21\myrpackages\logabs2
    

    where:

    the location of the rcmd.exe file is:

    C:\Program Files\R\R-3.0.1\bin\x64>
    

    the location of installed R packages on my computer is:

    c:\users\mmiller21\documents\r\win-library\3.0\
    

    and the location of my new R package prior to being installed is:

    c:\users\mmiller21\myrpackages\
    

    I found the syntax used in the DOS command window by trial and error, so it may not be ideal. At some point I also pasted a copy of 'logabs2.cpp' into 'C:\Program Files\R\R-3.0.1\bin\x64', but I do not think that mattered.

    Step 4. After installing the new R package I ran it using an R file I named 'new package usage.r' in the 'c:/users/mmiller21/myrpackages/' folder (although I do not think the folder was important). The contents of 'new package usage.r' are:

    library(logabs2)
    logabs2(seq(-5, 5, by=2))
    

    The output was:

    # [1] 1.609438 1.098612 0.000000 0.000000 1.098612 1.609438
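
    As a cross-check on the output above, here is a minimal sketch in plain C++ (no Rcpp) of the same element-wise calculation. The helper name log_abs and the use of std::vector are assumptions made for illustration; they are not part of the package described in this post.

    ```cpp
    #include <cmath>
    #include <vector>

    // Hypothetical helper (not from the original post): element-wise log(|x|)
    // on a plain std::vector, mirroring what log(abs(x)) computes in R.
    std::vector<double> log_abs(const std::vector<double>& x) {
        std::vector<double> out;
        out.reserve(x.size());
        for (double v : x) {
            out.push_back(std::log(std::fabs(v)));
        }
        return out;
    }
    ```

    For the input {-5, -3, -1, 1, 3, 5}, which are the values of seq(-5, 5, by=2), this gives log(5), log(3) and log(1) = 0 in the same symmetric pattern as the R output.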
    

    This file loaded the package Rcpp without me asking.

    In this case base R was faster, assuming I did this correctly:

    #> microbenchmark(logabs2(seq(-5, 5, by=2)), times = 100)
    #Unit: microseconds
    #                        expr    min     lq  median     uq     max neval
    # logabs2(seq(-5, 5, by = 2)) 43.086 44.453 50.6075 69.756 190.803   100
    
    #> microbenchmark(log(abs(seq(-5, 5, by=2))), times=100)
    #Unit: microseconds
    #                         expr    min     lq median    uq     max neval
    # log(abs(seq(-5, 5, by = 2))) 38.298 38.982 39.666 40.35 173.023   100
    

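    However, calling the function from the installed package was faster than compiling it on the fly with cppFunction. Note that the timing below measures the compilation itself, not a call to the compiled function:

    library(Rcpp)

    system.time(
      cppFunction("
        NumericVector logabs(NumericVector x) {
            return log(abs(x));
        }
      ")
    )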
    
    #   user  system elapsed 
    #   0.06    0.08    5.85 
    

    Although base R was as fast as or faster than the compiled *.dll file in this case, I expect compiled code called through Rcpp to be faster than base R in many cases, particularly where the R code is not already vectorized.

    This was my first attempt at creating an R package or using Rcpp, and no doubt I did not use the most efficient methods. Also, I apologize for any typographical errors in this post.

    EDIT

    In a comment below I think Romain Francois suggested I modify the *.cpp file to the following:

    #include <Rcpp.h>
    using namespace Rcpp;
    
    // [[Rcpp::export]]
    NumericVector logabs(NumericVector x) {
        return log(abs(x));
    }
    

    and recreate my R package, which I have now done. I then compared base R against my new package using the following code:

    library(logabs)
    
    logabs(seq(-5, 5, by=2))
    log(abs(seq(-5, 5, by=2)))
    
    library(microbenchmark)
    
    microbenchmark(logabs(seq(-5, 5, by=2)), log(abs(seq(-5, 5, by=2))), times = 100000)
    

    Base R is still a tiny bit faster or no different:

    Unit: microseconds
                             expr    min     lq median     uq       max neval
       logabs(seq(-5, 5, by = 2)) 42.401 45.137 46.505 69.073 39754.598 1e+05
     log(abs(seq(-5, 5, by = 2))) 37.614 40.350 41.718 62.234  3422.133 1e+05
    

    Perhaps this is because log and abs in base R are already vectorized and implemented in compiled code, so there is little left for Rcpp to speed up. I suspect that with more complex functions base R will be much slower. Or perhaps I am still not using the most efficient approach, or I simply made an error somewhere.
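
    To illustrate the kind of "more complex function" meant here, the following is a sketch of a loop in which each result depends on the previous one, which in R is usually written as an explicit loop rather than a single vectorized expression. The function name running_ewm and the parameter alpha are hypothetical, and the code is plain C++ on std::vector. In an Rcpp source file it would take and return NumericVector and carry the // [[Rcpp::export]] attribute. Whether it actually beats an R implementation would still need to be benchmarked.

    ```cpp
    #include <cstddef>
    #include <vector>

    // Hypothetical example (not from the original post): an exponentially
    // weighted running mean. Each output depends on the previous output,
    // so the work is done in an explicit loop.
    std::vector<double> running_ewm(const std::vector<double>& x, double alpha) {
        std::vector<double> out(x.size());
        if (x.empty()) {
            return out;  // nothing to smooth
        }
        out[0] = x[0];  // seed with the first observation
        for (std::size_t i = 1; i < x.size(); ++i) {
            out[i] = alpha * x[i] + (1.0 - alpha) * out[i - 1];
        }
        return out;
    }
    ```

    With alpha = 0.5 and input {1, 2, 3}, the loop produces 1, then 0.5*2 + 0.5*1 = 1.5, then 0.5*3 + 0.5*1.5 = 2.25.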
