I\'m trying to compare parallelization options. Specifically, I\'m comparing the standard SNOW and mulitcore implementations to those using d
To start with, you could write your foreach code a bit more concise :
FECltSim <- function(nSims=1000, size=100, mu=0, sigma=1) {
foreach(i=1:nSims, .combine=c) %dopar% {
mean(rnorm(n=size, mean=mu, sd=sigma))
}
}
This gives you a vector, no need to explicitly make it within the loop. Also no need to use cbind, as your result is every time just a single number. So .combine=c will do
The thing with foreach is that it creates quite a lot of overhead to communicate between the cores and get the results of the different cores fit together. A quick look at the profile shows this pretty clearly :
$by.self
self.time self.pct total.time total.pct
$ 5.46 41.30 5.46 41.30
$<- 0.76 5.75 0.76 5.75
.Call 0.76 5.75 0.76 5.75
...
More than 40% of the time it is busy selecting things. It also uses a lot of other functions for the whole operation. Actually, foreach is only advisable if you have relatively few rounds through very time consuming functions.
The other two solutions are built on a different technology, and do far less in R. On a sidenode, snow is actually initially developed to work on clusters more than on single workstations, like multicore is.