Why is R for loop 10 times slower than when using foreach?

Submitted by 青春壹個敷衍的年華 on 2019-12-06 19:46:10

Question


This is really blowing my mind. The basic loop takes about 8 seconds on my computer:

system.time({
x <- 0
for (p in 1:2) {
    for (i in 1:500) {
        for (j in 1:5000) {
            x <- x + i * j
        }
    }
}
})
x

Whereas if I use foreach in non-parallel mode, it takes only 0.7 seconds!

library(foreach)  # provides foreach() and %do%
system.time({
x <- 0
foreach(p = 1:2, .combine = rbind) %do% 
    for (i in 1:500) {
        for (j in 1:5000) {
            x <- x + i * j
        }
    }
})
x

The result is the same, but foreach somehow reaches it much faster than base R. Where is the inefficiency in base R?

How is this possible?

In fact, I got the complete opposite of the result reported in this question: Why is foreach() %do% sometimes slower than for?


Answer 1:


When used sequentially, foreach ends up byte-compiling the loop body with the compiler package, through its non-exported functions make.codeBuf and cmp. You can simulate this by compiling the inner loops to bytecode yourself with cmpfun, which achieves a similar speedup.

library(foreach)   # provides foreach() and %do%
library(compiler)  # provides cmpfun()

f.original <- function() {
x <- 0
for (p in 1:2) {
    for (i in 1:500) {
        for (j in 1:5000) {
            x <- x + i * j
        }
    }
}
x
}

f.foreach <- function() {
x <- 0
foreach(p = 1:2, .combine = rbind) %do% 
    for (i in 1:500) {
        for (j in 1:5000) {
            x <- x + i * j
        }
    }
x
}

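f.cmpfun <- function() {
    # Byte-compile the two inner loops; x is passed in and returned
    f <- cmpfun(function(x) {
        for (i in 1:500) {
            for (j in 1:5000) {
                x <- x + i * j
            }
        }
        x
    })
    # Two passes, matching the outer loop over p in 1:2
    f(f(0))
}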
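
The same idea can also be sketched at the expression level, which is closer to what the answer says foreach does internally: compile a quoted loop once with the exported compiler::compile, then evaluate the resulting bytecode. This is only an illustration under that assumption, not foreach's actual code path; the names loop_expr, loop_bytecode and run_compiled are made up for the example.

```r
library(compiler)

# Quoted loop body (hypothetical name), analogous to the expression after %do%
loop_expr <- quote({
    for (i in 1:500) {
        for (j in 1:5000) {
            x <- x + i * j
        }
    }
})

# Compile the expression to bytecode once
loop_bytecode <- compile(loop_expr)

# Evaluate the compiled expression in a local environment that holds x
run_compiled <- function() {
    x <- 0
    eval(loop_bytecode, envir = environment())
    x
}

result <- run_compiled()
```

Here result holds the sum for a single pass over i and j; calling the compiled expression twice on the same x would correspond to the outer loop over p.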

Results

library(microbenchmark)
microbenchmark(f.original(),f.foreach(),f.cmpfun(), times=5)
Unit: milliseconds
         expr       min        lq    median        uq       max neval
 f.original() 4033.6114 4051.5422 4061.7211 4072.6700 4079.0338     5
  f.foreach()  426.0977  429.6853  434.0246  437.0178  447.9809     5
   f.cmpfun()  418.2016  427.9036  441.7873  444.1142  444.4260     5
isTRUE(all.equal(f.original(), f.foreach())) && isTRUE(all.equal(f.original(), f.cmpfun()))
[1] TRUE
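
A note for anyone trying to reproduce these timings: since R 3.4.0 the JIT byte-code compiler is enabled by default, so a plain for loop is usually compiled already and the gap shown above may not appear. The sketch below uses compiler::enableJIT to time the same loop with the JIT switched off and then on. The function name sum_products and the timing variables are made up for the example, and the actual numbers will depend on the machine and R version.

```r
library(compiler)

# Hypothetical helper: one pass of the inner loops from the question
sum_products <- function() {
    x <- 0
    for (i in 1:500) {
        for (j in 1:5000) {
            x <- x + i * j
        }
    }
    x
}

# enableJIT() returns the previous JIT level, so it can be restored afterwards
old_level <- enableJIT(0)   # level 0: JIT off, the loop runs in the interpreter
time_interpreted <- system.time(res_interpreted <- sum_products())[["elapsed"]]

enableJIT(3)                # level 3: closures and loops are compiled before use
time_jit <- system.time(res_jit <- sum_products())[["elapsed"]]

enableJIT(old_level)        # restore the original setting

c(interpreted = time_interpreted, jit = time_jit)
```

If the two timings differ by roughly the factor seen in the benchmark above, that is consistent with byte-compilation being the source of the speedup.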


Source: https://stackoverflow.com/questions/24651664/why-is-r-for-loop-10-times-slower-than-when-using-foreach
