I recently discovered that one can use JIT (just-in-time) compilation with R using the compiler package (I summarized my findings on this topic in a recent blog post).
One of the questions I was asked is:
Is there any pitfall? It sounds too good to be true: just put in one line of code and that's it.
After looking around, I could find one possible issue having to do with the "start-up" time for the JIT. But is there any other issue to be careful about when using JIT?
I guess there will be some limitation having to do with R's environments architecture, but I cannot think of a simple illustration of the problem off the top of my head. Any suggestions or red flags would be of great help.
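For context, the one-liner I have in mind is something like this (a minimal sketch; level 3 is simply the most aggressive setting):

library(compiler)
# enableJIT(3) turns on the most aggressive JIT level:
# closures and top-level loops are byte-compiled before execution.
enableJIT(3)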
The rpart example given in another answer no longer seems to be an issue:
library("rpart")
fo = function() {
  for (i in 1:500) {
    rpart(Kyphosis ~ Age + Number + Start, data=kyphosis)
  }
}
system.time(fo())
# user system elapsed
# 1.212 0.000 1.206
compiler::enableJIT(3)
# [1] 3
system.time(fo())
# user system elapsed
# 1.212 0.000 1.210
I've also tried a number of other examples (sketched below), such as:
- growing a vector;
- a function that's just a wrapper around mean.
While I don't always get a speed-up, I've never experienced a significant slow-down.
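The kind of micro-benchmarks I mean look roughly like this (hypothetical sketches, not the exact code I ran):

library(compiler)

# Growing a vector element by element (lots of scalar work in the interpreter).
grow <- function(n) {
  x <- numeric(0)
  for (i in 1:n) x <- c(x, i)
  x
}

# A thin wrapper around mean (almost all time is spent inside mean itself).
wrap_mean <- function(x) mean(x)

enableJIT(3)
system.time(grow(1e4))
system.time(for (i in 1:1e4) wrap_mean(rnorm(100)))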
R> sessionInfo()
R version 3.3.0 (2016-05-03)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04 LTS
The output of a simple test with rpart could be taken as advice not to use enableJIT in ALL cases:
library(rpart)
fo <- function() for(i in 1:500){rpart(Kyphosis ~ Age + Number + Start, data=kyphosis)}
system.time(fo())
# user system elapsed
# 2.11   0.00    2.11
require(compiler)
enableJIT(3)
system.time(fo())
# user system elapsed
# 35.46  0.00   35.60
Any explanation?
In principle, once the byte-code is compiled and loaded, it should always run at least as fast as under the original AST interpreter. Some code will see big speedups; this is usually code with many scalar operations and loops, where most of the time is spent in R interpretation (I've seen examples with a 10x speedup, though arbitrary micro-benchmarks can of course inflate this as needed). Other code will run at the same speed; this is usually well-vectorized code that spends nearly no time in interpretation.

Compilation itself, however, can be slow. Hence, the just-in-time compiler now does not compile functions when it guesses that compilation won't pay off (and the heuristics change over time; this is already the case in 3.4.x). The heuristics don't always guess right, so there may be situations where compilation won't pay off. Typical problematic patterns are code generation, code modification, and manipulation of bindings in environments captured by closures.
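To illustrate the kind of pattern meant by the last sentence (a hedged sketch, not taken from an actual bug report): a hot loop that keeps creating closures and rewriting bindings in the environments they capture is exactly the sort of code the heuristics can misjudge.

# Pattern warned against above: closures created, and their captured
# bindings modified, inside a hot loop.
make_adder <- function(k) {
  offset <- k
  function(x) x + offset
}

total <- 0
for (i in 1:10000) {
  f <- make_adder(i)              # a new closure on every iteration
  environment(f)$offset <- i * 2  # modifying a binding captured by the closure
  total <- total + f(1)
}

# Usually preferable: one plain function reused with an extra argument.
add <- function(x, offset) x + offset
total <- 0
for (i in 1:10000) total <- total + add(1, i * 2)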
Packages can be byte-compiled at installation time so that the compilation cost is not paid (repeatedly) at run time, at least for code that is known ahead of time. This is now the default in the development version of R. While loading compiled code is much faster than compiling it, in some situations one may be loading code that will never be executed, so there actually may be some overhead, but overall pre-compilation is beneficial. Recently some parameters of the GC have been tuned to reduce the cost of loading code that won't be executed.
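For concreteness (with "mypackage" standing in for an actual package name), the field and flag below are the usual ways to opt in on versions where install-time byte-compilation is not yet the default:

# In a package's DESCRIPTION file:
#   ByteCompile: yes
# or when installing:
#   R CMD INSTALL --byte-compile mypackage
#   install.packages("mypackage", INSTALL_opts = "--byte-compile")

# Whether a loaded function was byte-compiled shows up when printing it:
print(rpart::rpart)  # ends with a <bytecode: ...> line if it was compiled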
My recommendation for package writers would be to use the defaults (just-in-time compilation is now on by default in released versions, and byte-compilation at package installation time is now on in the development version). If you find an example where the byte-code compiler does not perform well, please submit a bug report (I've also seen a case involving rpart in earlier versions). I would recommend against code generation and code manipulation, particularly in hot loops. This includes defining closures, and deleting and inserting bindings in environments captured by closures. One should definitely not do eval(parse(text=...)) in hot loops (and this was already bad without byte-compilation). It is always better to use branches than to generate new closures (without branches) dynamically. It is also better to write code with loops than to dynamically generate code with huge expressions (without loops). With the byte-code compiler, it is now often ok to write loops operating on scalars in R (the performance won't be as bad as before, so one can more often get away without switching to C for the performance-critical parts).
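As a minimal sketch of what "loops operating on scalars" means here (a toy example, not a benchmark from this answer):

library(compiler)
enableJIT(0)  # disable the JIT so the uncompiled version stays interpreted

# A scalar loop: nearly all of the work is R-level interpretation.
sum_sq <- function(n) {
  s <- 0
  for (i in 1:n) s <- s + i * i
  s
}

sum_sq_cmp <- cmpfun(sum_sq)  # explicit byte-compilation of the same function

system.time(sum_sq(1e6))      # interpreted
system.time(sum_sq_cmp(1e6))  # byte-compiled; typically noticeably faster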
Further to the previous answer, experimentation shows that the problem is not with the compilation of the loop but with the compilation of closures. [enableJIT(0) and enableJIT(1) leave the code fast, enableJIT(2) slows it down dramatically, and enableJIT(3) is slightly faster than the previous option (but still very slow).] Also, contrary to Hansi's comment, cmpfun slows execution down to a similar extent.
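The kind of check behind that observation looks roughly like this (a sketch; fo is the rpart benchmark function from above, and a clean comparison really needs a fresh R session per level, since compilation done at one level persists):

library(compiler)

enableJIT(0); system.time(fo())
enableJIT(1); system.time(fo())
enableJIT(2); system.time(fo())
enableJIT(3); system.time(fo())

# Explicit compilation of the function, without the JIT:
enableJIT(0)
fo_cmp <- cmpfun(fo)
system.time(fo_cmp())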
Source: https://stackoverflow.com/questions/10106736/possible-shortcomings-for-using-jit-with-r