I would like to calculate the pairwise Euclidean distance matrix. Following a suggestion from Dirk Eddelbuettel, I wrote Rcpp programs as follows:
First of all, writing the algorithm in Rcpp does not necessarily mean it will beat the R equivalent, especially if the R function calls a C or Fortran routine to perform the bulk of the computations. In other cases, where the function is written purely in R, there is a high probability that translating it to Rcpp will yield the desired speed gain.
Remember, when rewriting internal functions, one is going up against the R Core team of absolutely insane C programmers, who will most likely win out.
Secondly, the distance calculation that R's dist() uses is done in C, as indicated by:
.Call(C_Cdist, x, method, attrs, p)
which is the last line of the dist() function's R source. This gives it a slight advantage over C++, as the C code is more granular rather than templated.
Furthermore, the C implementation uses OpenMP when available to parallelize the computation.
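To make the OpenMP point concrete, here is a minimal sketch of how a pairwise-distance loop could be parallelized. This is not R's actual C code: the function name pairwise_dist_omp, its signature, and the dynamic schedule are assumptions made for illustration. It takes an n x p matrix in column-major order (the layout R uses for matrices) and returns the lower triangle by columns, which is how a dist object stores its values.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of R or Rcpp): Euclidean distances between
// all pairs of rows of an n x p matrix stored column-major.
// Returns the lower triangle by columns: n * (n - 1) / 2 values.
std::vector<double> pairwise_dist_omp(const std::vector<double>& x, int n, int p) {
    assert(n >= 0 && p >= 0);
    assert(x.size() == static_cast<std::size_t>(n) * static_cast<std::size_t>(p));
    const std::size_t N = static_cast<std::size_t>(n);
    std::vector<double> out(n > 1 ? N * (N - 1) / 2 : 0);

    // Each (i, j) pair is written by exactly one iteration of the outer loop,
    // so threads never touch the same output element. Without -fopenmp the
    // pragma is ignored and the loop simply runs serially.
    #pragma omp parallel for schedule(dynamic)
    for (int i = 0; i < n - 1; ++i) {
        const std::size_t I = static_cast<std::size_t>(i);
        // Offset of column i of the lower triangle in the condensed vector.
        const std::size_t base = N * I - I * (I + 1) / 2;
        for (int j = i + 1; j < n; ++j) {
            double ss = 0.0;  // running sum of squared differences
            for (int k = 0; k < p; ++k) {
                const double diff = x[I + k * N] - x[j + k * N];
                ss += diff * diff;
            }
            out[base + (j - i - 1)] = std::sqrt(ss);
        }
    }
    return out;
}
```

For the three points (0,0), (3,4) and (6,8), passed column-major as {0, 3, 6, 0, 4, 8} with n = 3 and p = 2, this returns {5, 10, 5}. The dynamic schedule is a guess at balancing the shrinking inner loop; whether it helps would need to be benchmarked.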
Thirdly, by changing the subset order slightly and avoiding the creation of an additional variable, the gap in timings between the versions decreases.
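#include <Rcpp.h>

// [[Rcpp::export]]
Rcpp::NumericMatrix calcPWD1 (const Rcpp::NumericMatrix & x){
  // Signed int: with unsigned, outrows - 1 would wrap around for a 0-row input.
  const int outrows = x.nrow();
  double d;
  Rcpp::NumericMatrix out(outrows, outrows);

  for (int i = 0; i < outrows - 1; i++){
    // Extract row i once and reuse it across the inner loop.
    Rcpp::NumericVector v1 = x.row(i);
    for (int j = i + 1; j < outrows; j++){
      // Euclidean distance between rows i and j via Rcpp sugar.
      d = std::sqrt(sum(pow(v1 - x.row(j), 2.0)));
      out(j, i) = d;  // lower triangle
      out(i, j) = d;  // upper triangle
    }
  }
  return out;
}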
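If the remaining gap still matters, one further thing that could be tried is dropping the row extraction and sugar expression altogether and accumulating the squared differences in plain loops. The sketch below shows that idea as a standalone kernel with no Rcpp dependency; the name calc_pwd_loops and its signature are hypothetical, and it assumes the input is an n x p column-major array of doubles, which is how R lays out a numeric matrix. Whether it is actually faster than the sugar version is something to benchmark, not assume.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical kernel (name and signature are illustrative only): fills the
// full n x n symmetric distance matrix, column-major, for an n x p
// column-major input.
std::vector<double> calc_pwd_loops(const double* x, std::size_t n, std::size_t p) {
    assert(x != nullptr || n * p == 0);
    std::vector<double> out(n * n, 0.0);  // diagonal stays at 0

    for (std::size_t i = 0; i + 1 < n; ++i) {
        for (std::size_t j = i + 1; j < n; ++j) {
            double ss = 0.0;  // sum of squared differences, no temporaries
            for (std::size_t k = 0; k < p; ++k) {
                const double diff = x[i + k * n] - x[j + k * n];
                ss += diff * diff;
            }
            const double d = std::sqrt(ss);
            out[j + i * n] = d;  // element (j, i), lower triangle
            out[i + j * n] = d;  // element (i, j), upper triangle
        }
    }
    return out;
}
```

Writing the loop bound as i + 1 < n keeps the unsigned arithmetic safe for an empty input. Inside an exported Rcpp function the same three loops could presumably be written directly against the matrix's underlying storage, but that wiring is left out here.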