Changing R's Seed from Rcpp to Guarantee Reproducibility

遥遥无期 2020-12-06 20:39

I am trying to write a function r(d, n) in Rcpp. The function returns n random draws from the normal distribution N(0, d). This function should be well defined, therefore the fu

1 answer
  • 2020-12-06 21:05

    The issue, I think, is that you are trying to rescale the output of Armadillo's randn, which is restricted to the standard normal N(0, 1), so that it matches N(0, d). Because the draws are standard normal, there are two ways to go about this.

    Option 1: Using Statistical Properties

    The first way is simply to multiply the sample by the square root of d, e.g. sqrt(d)*sample. This works because of how expectation and variance behave under scaling of a random variable: sqrt(d)*N(0, 1) ~ N(0, sqrt(d)^2) = N(0, d).
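
The scaling property used above can be written out in one step. This is a short derivation, assuming Z denotes a standard normal draw and d > 0 is the target variance (Z is notation introduced here, not in the original post):

```latex
\begin{aligned}
Z &\sim N(0, 1), \\
\mathbb{E}\bigl[\sqrt{d}\,Z\bigr] &= \sqrt{d}\,\mathbb{E}[Z] = 0, \\
\operatorname{Var}\bigl(\sqrt{d}\,Z\bigr) &= \bigl(\sqrt{d}\bigr)^{2}\operatorname{Var}(Z) = d .
\end{aligned}
```

Since a scaled normal variable is still normal, sqrt(d)*Z has exactly the N(0, d) distribution.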

    One of the more important things to note here is that the set_seed() function shown below works because RcppArmadillo configures Armadillo to hook into R's RNG library, accessing the ::Rf_runif function to generate random values. The one area of concern is that you cannot use arma::arma_rng::set_seed() to set the seed, due to limitations of the R/C++ interaction detailed in Section 6.3 of Writing R Extensions. If you do use it, you will be warned with:

    When called from R, the RNG seed has to be set at the R level via set.seed()

    on the first detected call.

    With that said, here is a short code example where we multiply by sqrt(d).

    Code:

    #include <RcppArmadillo.h>
    // [[Rcpp::depends(RcppArmadillo)]]
    
    // set seed
    // [[Rcpp::export]]
    void set_seed(double seed) {
        Rcpp::Environment base_env("package:base");
        Rcpp::Function set_seed_r = base_env["set.seed"];
        set_seed_r(std::floor(std::fabs(seed)));
    }
    
    // function r(d, n)
    // [[Rcpp::export]]
    arma::vec randdraw(double d, int n){
        set_seed(d);              // Set a seed for R's RNG library
        // Call Armadillo's RNG procedure that references R's RNG capabilities
        // and change dispersion slightly.
        arma::vec out = std::sqrt(std::fabs(d))*arma::randn(n);
        return out;
    }
    

    Output:

    > randdraw(3.5, 5L)
               [,1]
    [1,] -0.8671559
    [2,] -1.9507540
    [3,]  2.9025090
    [4,] -1.2953745
    [5,]  2.0799176
    

    Note: There is no direct R equivalent to compare this output against, because arma::randn generates its draws differently from rnorm, so the same seed yields different values.
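
The re-seed-then-scale pattern of Option 1 can also be tried outside of R. The following is a minimal sketch in standard C++ only: it swaps R's RNG for std::mt19937, and randdraw_std is a hypothetical name, so the numbers it produces will not match either arma::randn or rnorm. It only illustrates that re-seeding from d on every call makes the result a deterministic function of (d, n).

```cpp
#include <cassert>
#include <cmath>
#include <random>
#include <vector>

// Hypothetical stand-in for randdraw(d, n). It mirrors the structure of the
// Rcpp version but uses the C++ standard library RNG instead of R's RNG.
std::vector<double> randdraw_std(double d, int n) {
    assert(n >= 0);

    // Derive an integer seed from d, mirroring std::floor(std::fabs(seed)).
    const auto seed =
        static_cast<std::mt19937::result_type>(std::floor(std::fabs(d)));
    std::mt19937 engine(seed);  // re-seeded on every call

    std::normal_distribution<double> std_normal(0.0, 1.0);
    const double scale = std::sqrt(std::fabs(d));  // sd = sqrt(variance)

    std::vector<double> out(static_cast<std::size_t>(n));
    for (double& x : out) {
        x = scale * std_normal(engine);  // sqrt(d) * N(0, 1) ~ N(0, d)
    }
    return out;
}
```

Calling randdraw_std(3.5, 5) twice returns the same five values, which is the "well defined" behaviour the question asks for.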

    Option 2: Rely upon R's RNG Functions

    The second, and significantly better, solution is to rely explicitly on R's RNG functions. Previously, we made implicit use of R's RNG library through RcppArmadillo's configuration. I tend to prefer this approach because you have already assumed the code is specific to R by using the set_seed() function (Disclaimer: I wrote the post). If you are worried about the restriction that the seed, here d, be an integer, a slight coercion from double to int is possible with std::floor(std::fabs(seed)). Note that Rcpp::rnorm() takes a standard deviation as its third argument, so the code below draws from a normal with standard deviation d; pass std::sqrt(d) instead if d is meant to be the variance, as in Option 1. Once the values are generated using either Rcpp::r*() or R::r*(), an Armadillo vector is created using an advanced ctor that reuses the existing memory allocation.
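
To make the seed coercion concrete, here is a small sketch that isolates it in plain C++. The helper name seed_from_double is hypothetical and not part of Rcpp or the original answer; the expression inside it is the one passed to set.seed() in set_seed().

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper isolating the coercion used inside set_seed():
// drop the sign, then truncate toward zero to get a non-negative integer.
long seed_from_double(double seed) {
    assert(std::isfinite(seed));  // NaN or infinity cannot be turned into a seed
    return static_cast<long>(std::floor(std::fabs(seed)));
}
```

One consequence worth keeping in mind: every d with the same integer part maps to the same seed. For example, 3.21, 3.9 and -3.21 all give 3, which is why the output of randdraw(3.21, 10) shown in this answer lines up with set.seed(3) in R.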

    Code:

    #include <RcppArmadillo.h>
    // [[Rcpp::depends(RcppArmadillo)]]
    
    // set seed
    // [[Rcpp::export]]
    void set_seed(double seed) {
        Rcpp::Environment base_env("package:base");
        Rcpp::Function set_seed_r = base_env["set.seed"];
        set_seed_r(std::floor(std::fabs(seed)));
    }
    
    // function r(d, n)
    // [[Rcpp::export]]
    arma::vec randdraw(double d, int n){
        set_seed(d);                                      // Set a seed for R's RNG library
        Rcpp::NumericVector draws = Rcpp::rnorm(n, 0.0, d); // Hook into R's Library
        // Use Armadillo's advanced CTOR to re-use memory and cast as an armadillo object.
        arma::vec out = arma::vec(draws.begin(), n, false, true);
        return out;
    }
    

    Output:

    > randdraw(3.21,10)
                 [,1]
     [1,] -3.08780627
     [2,] -0.93900757
     [3,]  0.83071017
     [4,] -3.69834335
     [5,]  0.62846287
     [6,]  0.09669786
     [7,]  0.27419092
     [8,]  3.58431878
     [9,] -3.91253230
    [10,]  4.06825360
    > set.seed(3)
    > rnorm(10, 0, 3.21)
     [1] -3.08780627 -0.93900757  0.83071017 -3.69834335  0.62846287  0.09669786  0.27419092  3.58431878 -3.91253230  4.06825360
    