Recursive dependence of data: for loop using Rcpp

Question

I've got a function written in Rcpp:

library(Rcpp)
cppFunction("NumericVector MatVecMul_cpp (NumericVector y, double k) {
  int n = y.size();
  NumericVector z(n);
  int i; double *p1, *p2, *end = &z[n];
  double tmp = 1.0;
  for (i = 0; i < n; i++) {
    for (p1 = &z[i], p2 = &y[0]; p1 < end; p1++, p2++) *p1 += tmp * (*p2);
    tmp *= k;
  }
  return z;
}")

Basically, the function takes a numeric vector y and a parameter k and computes an output vector z whose i-th element is the (i-1)-th element of z multiplied by k plus the i-th element of y. Now I need a tweak: an additional parameter c such that z is set to 0 once more than c rows have passed since the last non-zero value in y. See the desired output below for c = 4, k = 0.9.

structure(list(y = c(0.7, 0, 0, 0, 0, 0, 0, 4, 0, 0, 6, 0, 0, 
0), z = c(0.7, 0.63, 0.567, 0.5103, 0.45927, 0, 0, 4, 3.6, 3.24, 
8.916, 8.0244, 7.22196, 6.499764)), row.names = c(NA, -14L), class = "data.frame")

So once again, the 6th value of z is 0 because c is equal to 4, so after four decay steps we no longer multiply the previous value of z. But the 11th value of z is 8.916 because we not only multiply the previous value by 0.9 but also add 6.0 from the y column.

I tried adding a new 0-1 column named ctrl to the data.frame to indicate whether the 0.9 decay still applies, and then adjusting the function above accordingly, but the following did not work (the values of z do not reset where ctrl = 0).

cppFunction("NumericVector adjust_cpp (NumericVector y, double k, NumericVector ctrl) {
  int n = y.size();
  NumericVector z(n);
  int i; double *p1, *p2, *p3, *end = &z[n];
  double tmp = 1.0;
  for (i = 0; i < n; i++) {
    for (p1 = &z[i], p2 = &y[0], p3 = &ctrl[0]; p1 < end; p1++, p2++, p3++) {
      *p1 += tmp * (*p2);
      *p1 *= *p3;
    }
    tmp *= k;
  }
  return z;
}")

How can I accomplish that?

structure(list(y = c(0.7, 0, 0, 0, 0, 0, 0, 4, 0, 0, 6, 0, 0, 
0), z = c(0.7, 0.63, 0.567, 0.5103, 0.45927, 0, 0, 4, 3.6, 3.24, 
8.916, 8.0244, 7.22196, 6.499764), ctrl = c(1, 1, 1, 1, 1, 0, 
0, 1, 1, 1, 1, 1, 1, 1)), .Names = c("y", "z", "ctrl"), row.names = c(NA, 
-14L), class = "data.frame")

With the above data, in R this would be:

fun <- function(y, k, ctrl) {
  n <- length(y)
  z <- numeric(n)
  z[1] <- y[1]
  for (i in 1:(n - 1)) {
    z[i + 1] <- (y[i + 1] + z[i] * k) * ctrl[i + 1]
  }
  return(z)
}

Answer 1:

Translating such a simple R function into Rcpp can be done line by line with minimal changes:

#include <Rcpp.h>
using Rcpp::NumericVector;

// [[Rcpp::export]]
NumericVector funC(NumericVector y, double k, NumericVector ctrl) {
  R_xlen_t n = y.length();
  NumericVector z(n);
  if (n == 0) return z;  // empty input: nothing to compute, avoid indexing element 0
  z(0) = y(0);
  for (R_xlen_t i = 0; i < n - 1; ++i) {
    z(i + 1) = (y(i + 1) + z(i) * k) * ctrl(i + 1);
  }
  return z;
}

/*** R
df <- structure(list(y = c(0.7, 0, 0, 0, 0, 0, 0, 4, 0, 0, 6, 0, 0, 
0), z = c(0.7, 0.63, 0.567, 0.5103, 0.45927, 0, 0, 4, 3.6, 3.24, 
8.916, 8.0244, 7.22196, 6.499764), ctrl = c(1, 1, 1, 1, 1, 0, 
0, 1, 1, 1, 1, 1, 1, 1)), .Names = c("y", "z", "ctrl"), row.names = c(NA, 
-14L), class = "data.frame")

fun <- function(y, k, ctrl) {
  n <- length(y)
  z <- numeric(n)
  z[1] <- y[1]
  for (i in 1:(n - 1)) {
    z[i + 1] <- (y[i + 1] + z[i] * k) * ctrl[i + 1]
  } 
  return(z)
}

z <- fun(df$y, 0.9, df$ctrl)
all.equal(df$z, z)
z <- funC(df$y, 0.9, df$ctrl)
all.equal(df$z, z)
*/

For the provided vectors of length 14, the R version is still faster on this machine. Duplicating y and ctrl ten times yields vectors for which the Rcpp version is already faster.
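
If building the ctrl column beforehand is inconvenient, the cutoff can probably be applied directly from c inside the same single pass. The following is only a minimal sketch under stated assumptions: decay_with_cutoff is a hypothetical name that does not appear in the question or the answer, and it uses std::vector instead of Rcpp::NumericVector so that it compiles without R. It assumes the rule implied by the desired output, namely that z is forced to 0 once more than c rows have passed since the last non-zero entry of y.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of the original answer.
// Applies z[i] = y[i] + k * z[i-1], but sets z[i] to 0 once more than
// `c` rows have passed since the last non-zero entry of y (assumed rule).
std::vector<double> decay_with_cutoff(const std::vector<double>& y,
                                      double k, std::size_t c) {
  const std::size_t n = y.size();
  std::vector<double> z(n, 0.0);
  std::size_t since = 0;  // rows elapsed since the last non-zero y
  double prev = 0.0;      // previous output value, z[i-1]
  for (std::size_t i = 0; i < n; ++i) {
    if (y[i] != 0.0) {
      since = 0;          // a new non-zero input restarts the window
    } else {
      ++since;
    }
    // Inside the window the usual recurrence applies; outside it z is 0.
    // Before the first non-zero y both branches yield 0, so no flag is needed.
    z[i] = (since <= c) ? y[i] + k * prev : 0.0;
    prev = z[i];
  }
  return z;
}
```

Called with the y column from the question, k = 0.9 and c = 4, this reproduces the z column of the desired output up to floating-point rounding. Porting it to Rcpp should mostly be a matter of swapping the vector types and adding the export attribute, although that variant is untested here.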

Source: https://stackoverflow.com/questions/51967783/recursive-dependence-of-data-for-loop-using-rcpp

Tags

rcpp