Possible shortcomings for using JIT with R?

夙愿已清 提交于 2019-11-28 17:48:38

The rpart example given above, no longer seems to be an issue:

library("rpart")
fo = function() {
  for(i in 1:500){
    rpart(Kyphosis ~ Age + Number + Start, data=kyphosis)
  }
}    system.time(fo())
#   user  system elapsed 
#  1.212   0.000   1.206 
compiler::enableJIT(3)
# [1] 3
system.time(fo())
#   user  system elapsed 
#  1.212   0.000   1.210 

I've also tried a number of other examples, such as

  • growing a vector;
  • A function that's just a wrapper around mean

While I don't always get a speed-up, I've never experience a significant slow-down.


R> sessionInfo()
R version 3.3.0 (2016-05-03)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04 LTS

the output of a simple test with rpart could be an advice not to use enableJIT in ALL cases:

library(rpart)
fo <- function() for(i in 1:500){rpart(Kyphosis ~ Age + Number + Start, data=kyphosis)}
system.time(fo())
#User      System verstrichen 
#2.11        0.00        2.11 

require(compiler)
enableJIT(3)
system.time(fo())
#User      System verstrichen 
#35.46        0.00       35.60

Any explanantion?

In principle, once the byte-code is compiled and loaded, it should always be interpreted at least as fast as the original AST interpreter. Some code will benefit from big speedups, this is usually code with a lot of scalar operations and loops where most time is spent in R interpretation (I've seen examples with 10x speedup but arbitrary micro-benchmarks could indeed inflate this as needed). Some code will run at the same speed, this is usually code well vectorized and hence spending nearly no time in interpretation. Now, compilation itself can be slow. Hence, the just in time compiler now does not compile functions when it guesses it won't pay off (and the heuristics change over time, this is already in 3.4.x). The heuristics don't always guess it right, so there may be situations when compilation won't pay off. Typical problematic patterns are code generation, code modification and manipulation of bindings of environments captured in closures.

Packages can be byte-compiled at installation time so that the compilation cost is not paid (repeatedly) at run time, at least for code that is known ahead of time. This is now the default in development version of R. While the loading of compiled code is much faster than compiling it, in some situations one may be loading even code that won't be executed, so there actually may be an overhead, but overall pre-compilation is beneficial. Recently some parameters of the GC have been tuned to reduce the cost of loading code that won't be executed.

My recommendation for package writers would be to use the defaults (just-in-time compilation is now on by default in released versions, byte-compilation at package installation time is now on in the development version). If you find an example where the byte-code compiler does not perform well, please submit a bug report (I've also seen a case involving rpart in earlier versions). I would recommend against code generation and code manipulation and particularly so in hot loops. This includes defining closures, deleting and inserting bindings in environments captured by closures. Definitely one should not do eval(parse(text= in hot loops (and this had been bad already without byte-compilation). It is always better to use branches than to generate new closures (without branches) dynamically. Also it is better to write code with loops than to dynamically generate code with huge expressions (without loops). Now with the byte-code compiler, it is now often ok to write loops operating on scalars in R (the performance won't be as bad as before, so one could more often get away without switching to C for the performance critical parts).

Further to the previous answer, experimentation shows the problem is not with the compilation of the loop, it is with the compilation of closures. [enableJIT(0) or enableJIT(1) leave the code fast, enableJIT(2) slows it down dramatically, and enableJIT(3) is slightly faster than the previous option (but still very slow)]. Also contrary to Hansi's comment, cmpfun slows execution to a similar extent.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!