Loops inefficiency in R

后端未结

关注

 2  1100

暗喜 2021-01-01 02:59

Good morning,

I have been developing for a few months in R and I have to make sure that the execution time of my code is not too long because I analyze big datasets.

2条回答

佛祖请我去吃肉 (楼主)

2021-01-01 03:37
Just a couple of comments. A for loop is roughly as fast as apply and its variants, and the real speed-ups come when you vectorise your function as much as possible (that is, using low-level loops, rather than apply, which just hides the for loop). I'm not sure if this is the best example, but consider the following:
```
> n <- 1e06
> sinI <- rep(NA,n)
> system.time(for(i in 1:n) sinI[i] <- sin(i))
   user  system elapsed 
  3.316   0.000   3.358 
> system.time(sinI <- sapply(1:n,sin))
   user  system elapsed 
  5.217   0.016   5.311 
> system.time(sinI <- unlist(lapply(1:n,sin),
+       recursive = FALSE, use.names = FALSE))
   user  system elapsed 
  1.284   0.012   1.303 
> system.time(sinI <- sin(1:n))
   user  system elapsed 
  0.056   0.000   0.057 
```
In one of the comments below, Marek points out that the time consuming part of the for loop above is actually the ]<- part:
```
> system.time(sinI <- unlist(lapply(1:n,sin),
+       recursive = FALSE, use.names = FALSE))
   user  system elapsed 
  1.284   0.012   1.303 
```
The bottlenecks which can't immediately be vectorised can be rewritten in C or Fortran, compiled with R CMD SHLIB, and then plugged in with .Call, .C or .Fortran.

Also, see these links for more info about loop optimisation in R. Also check out the article "How Can I Avoid This Loop or Make It Faster?" in R News.
0 讨论(0)

查看其它2个回答
发布评论:

提交评论
- 加载中...