On my two computers, I tried this code:
N <- 10e3
M <- 2000
X <- matrix(rnorm(N * M), N)
system.time(crossprod(X))
The first one i
TL;DR: CentOS uses a single-threaded OpenBLAS; Linux Mint uses the reference BLAS by default but can be switched to other BLAS implementations.
The R packages for CentOS available from EPEL depend on openblas-Rblas. This seems to be an OpenBLAS build providing BLAS for R. So while it looks like R's BLAS is used, it actually is OpenBLAS. The LAPACK version is always the one provided by R.
On Debian and derived distributions like Mint, r-base-core depends on a BLAS and a LAPACK library. By default these are provided by the reference implementations libblas3 and liblapack3. These are not particularly fast, but you can easily replace them by installing packages like libopenblas-base. You control which BLAS and LAPACK your system uses via update-alternatives.
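To see which libraries your R session actually picked up, you can ask R itself. This is a minimal sketch, assuming R 3.4.0 or later (where sessionInfo() reports the BLAS and LAPACK paths); on older versions or some platforms these fields may be missing or empty.

```r
# Report the BLAS/LAPACK shared libraries used by the running R session.
si <- sessionInfo()
blas_path <- si$BLAS        # path to the BLAS library (R >= 3.4.0)
lapack_path <- si$LAPACK    # path to the LAPACK library (R >= 3.4.0)
lapack_version <- La_version()

cat("BLAS:          ", blas_path, "\n")
cat("LAPACK:        ", lapack_path, "\n")
cat("LAPACK version:", lapack_version, "\n")
```

A path containing "openblas" suggests OpenBLAS is in use; a path ending in libblas.so.3 without further hints usually points at whatever update-alternatives has selected.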
For controlling the number of threads with OpenBLAS I normally use RhpcBLASctl:
N <- 20000
M <- 2000
X <- matrix(rnorm(N * M), N)
RhpcBLASctl::blas_set_num_threads(2)
system.time(crossprod(X))
#> user system elapsed
#> 2.492 0.331 1.339
RhpcBLASctl::blas_set_num_threads(1)
system.time(crossprod(X))
#> user system elapsed
#> 2.319 0.052 2.316
For some reason, setting the environment variables OPENBLAS_NUM_THREADS, GOTO_NUM_THREADS or OMP_NUM_THREADS from within R does not have the desired effect. On CentOS even RhpcBLASctl does not help, since the OpenBLAS build used there is single-threaded.
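A quick way to check whether the thread setting has any effect on a given machine is to time the same crossprod() call under several thread counts. The sketch below assumes RhpcBLASctl is installed; time_crossprod is a hypothetical helper name, and the matrix size is kept small so it runs quickly. A likely reason the environment variables have no effect from inside R, though I have not verified it here, is that OpenBLAS reads them when the library is loaded, so they would have to be set in the shell before R starts.

```r
# Hypothetical helper: elapsed time of crossprod() for each BLAS thread count.
time_crossprod <- function(threads, n = 2000, m = 500) {
  X <- matrix(rnorm(n * m), n)
  vapply(threads, function(k) {
    RhpcBLASctl::blas_set_num_threads(k)   # request k BLAS threads
    system.time(crossprod(X))[["elapsed"]]
  }, numeric(1))
}

if (requireNamespace("RhpcBLASctl", quietly = TRUE)) {
  # Number of processors the BLAS library reports it can use
  print(RhpcBLASctl::blas_get_num_procs())
  # Elapsed seconds with 1 and 2 threads
  print(time_crossprod(c(1, 2)))
}
```

If the timings barely differ between one and two threads, the BLAS in use is probably single-threaded (as on CentOS) or the reference implementation.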