This is really blowing my mind. The basic loop takes about 8 seconds on my computer:
system.time({
  x <- 0
  for (p in 1:2) {
    for (i in 1:500) {
      for (j in 1:5000) {
        x <- x + i * j
      }
    }
  }
})
x
Whereas if I use foreach in non-parallel mode, it takes only 0.7 seconds!
library(foreach)
system.time({
  x <- 0
  foreach(p = 1:2, .combine = rbind) %do%
    for (i in 1:500) {
      for (j in 1:5000) {
        x <- x + i * j
      }
    }
})
x
The result is the same, but foreach was somehow able to reach it much faster than basic R! Where is the inefficiency of basic R?
How is this possible?
In fact, I got the complete opposite of the result in this question: Why is foreach() %do% sometimes slower than for?
When used sequentially, foreach eventually uses the compiler package to produce byte code, via the non-exported functions make.codeBuf and cmp. You can use cmpfun to compile the inner loop into byte code to simulate this and achieve a similar speedup.
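As a minimal sketch of that compilation step in isolation, assuming only the compiler package that ships with R: the names inner and inner_cmp are hypothetical and the loop bounds are shrunk so it runs quickly. Note that R 3.4.0 and later enable the JIT compiler by default, so the sketch switches it off first to get a genuinely uncompiled function to compare against.

```r
library(compiler)

# R >= 3.4.0 enables the JIT by default; turn it off so that `inner`
# really runs uncompiled. enableJIT() returns the previous level.
old_jit <- enableJIT(0)

# Hypothetical helper: a smaller version of the inner double loop.
inner <- function(x) {
  for (i in 1:50) {
    for (j in 1:500) {
      x <- x + i * j
    }
  }
  x
}

# Byte-compiled copy of the same function.
inner_cmp <- cmpfun(inner)

# Both versions return the same value; only the speed differs.
stopifnot(identical(inner(0), inner_cmp(0)))

# Restore the previous JIT level.
enableJIT(old_jit)
```

With the JIT left at its default on a recent R version, the plain for loop is compiled automatically, so the gap reported here will probably be much smaller or absent.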
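library(foreach)
library(compiler)

# Plain nested for loops
f.original <- function() {
  x <- 0
  for (p in 1:2) {
    for (i in 1:500) {
      for (j in 1:5000) {
        x <- x + i * j
      }
    }
  }
  x
}

# Outer loop replaced by a sequential foreach
f.foreach <- function() {
  x <- 0
  foreach(p = 1:2, .combine = rbind) %do%
    for (i in 1:500) {
      for (j in 1:5000) {
        x <- x + i * j
      }
    }
  x
}

# Inner loops byte-compiled with cmpfun
f.cmpfun <- function() {
  f <- cmpfun(function(x) {
    for (i in 1:500) {
      for (j in 1:5000) {
        x <- x + i * j
      }
    }
    x
  })
  f(f(0))  # two calls stand in for the p = 1:2 loop
}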
Results
library(microbenchmark)
microbenchmark(f.original(), f.foreach(), f.cmpfun(), times = 5)
Unit: milliseconds
         expr       min        lq    median        uq       max neval
 f.original() 4033.6114 4051.5422 4061.7211 4072.6700 4079.0338     5
  f.foreach()  426.0977  429.6853  434.0246  437.0178  447.9809     5
   f.cmpfun()  418.2016  427.9036  441.7873  444.1142  444.4260     5
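# all.equal() compares only two values (a third positional argument is taken as the tolerance), so check pairwise
isTRUE(all.equal(f.original(), f.foreach())) && isTRUE(all.equal(f.original(), f.cmpfun()))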
[1] TRUE
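The equality of the results can also be checked independently of any of the three functions. The sketch below is only a sanity check under the assumption that the loop bounds are exactly those in the question; the variable name expected is hypothetical. Because the sum of i * j over both indices factorizes into sum(i) * sum(j), and the outer p loop runs twice, the accumulated value has a closed form.

```r
# Closed form: the p loop runs twice, and the double sum of i * j
# factorizes into sum(i) * sum(j). as.numeric() avoids integer
# overflow when the two sums are multiplied.
expected <- 2 * sum(as.numeric(1:500)) * sum(as.numeric(1:5000))

# Recompute the value with the plain triple loop from the question.
x <- 0
for (p in 1:2) {
  for (i in 1:500) {
    for (j in 1:5000) {
      x <- x + i * j
    }
  }
}

# The loop result matches the closed form.
stopifnot(isTRUE(all.equal(x, expected)))
```

Every intermediate sum here is a whole number well below 2^53, so the double-precision result is exact and the comparison does not depend on the tolerance of all.equal.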
Source: https://stackoverflow.com/questions/24651664/why-is-r-for-loop-10-times-slower-than-when-using-foreach