Large performance differences between OS for matrix computation

[愿得一人] 2020-12-06 07:54

On my two computers, I tried this code:

N <- 10e3
M <- 2000
X <- matrix(rnorm(N * M), N)
system.time(crossprod(X))

The first one i

2 answers
  •  情深已故
    2020-12-06 08:50

    TL;DR: CentOS uses a single-threaded OpenBLAS; Linux Mint uses the reference BLAS by default but can be switched to other BLAS implementations.

    The R packages for CentOS available from EPEL depend on openblas-Rblas. This seems to be an OpenBLAS build that provides the BLAS for R. So although it looks as if R's own BLAS is being used, it is actually OpenBLAS. The LAPACK version is always the one provided by R.

    On Debian and derived distributions like Mint, r-base-core depends on

    • libblas3 | libblas.so.3
    • liblapack3 | liblapack.so.3

    By default these are provided by the reference implementations libblas3 and liblapack3. These are not particularly fast, but you can replace them easily by installing packages like libopenblas-base. You have control over the BLAS and LAPACK used on your system via update-alternatives.
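
    To see which BLAS and LAPACK libraries a given R session actually ended up with, base R can report the paths of the shared libraries it loaded. A minimal sketch, assuming R 3.4.0 or later (where extSoftVersion() includes a BLAS entry and La_library() is available); the paths printed depend entirely on the system, and the BLAS entry may be empty on some platforms:

```r
# Report which BLAS and LAPACK shared libraries this R session is using.
# extSoftVersion(), La_library() and La_version() are all base R functions.
blas_path   <- extSoftVersion()[["BLAS"]]  # path to the BLAS library, may be ""
lapack_path <- La_library()                # path to the LAPACK library
lapack_ver  <- La_version()                # LAPACK version string

cat("BLAS:          ", blas_path, "\n")
cat("LAPACK:        ", lapack_path, "\n")
cat("LAPACK version:", lapack_ver, "\n")
```

    Running this before and after switching the alternative should show the path changing, for example from a reference libblas to an OpenBLAS library.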

    For controlling the number of threads with OpenBLAS I normally use RhpcBLASctl:

    N <- 20000
    M <- 2000
    X <- matrix(rnorm(N * M), N)
    # Two BLAS threads
    RhpcBLASctl::blas_set_num_threads(2)
    system.time(crossprod(X))
    #>    user  system elapsed 
    #>   2.492   0.331   1.339
    # One BLAS thread
    RhpcBLASctl::blas_set_num_threads(1)
    system.time(crossprod(X))
    #>    user  system elapsed 
    #>   2.319   0.052   2.316
    

    For some reason, setting the environment variables OPENBLAS_NUM_THREADS, GOTO_NUM_THREADS or OMP_NUM_THREADS from within R does not have the desired effect. On CentOS even RhpcBLASctl does not help, since the OpenBLAS build used there is single-threaded.
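
    One way to check whether the installed BLAS responds to thread settings at all is to ask it how many threads it reports and then time the same call at different counts. This is only a sketch: time_crossprod is a hypothetical helper name, the matrix is deliberately small so the run stays short, and the code assumes RhpcBLASctl is installed (it skips the comparison otherwise):

```r
# Hypothetical helper: set the BLAS thread count, then return the
# elapsed time (in seconds) of crossprod(X).
time_crossprod <- function(X, threads) {
  RhpcBLASctl::blas_set_num_threads(threads)
  system.time(crossprod(X))[["elapsed"]]
}

if (requireNamespace("RhpcBLASctl", quietly = TRUE)) {
  # Small matrix so the comparison finishes quickly
  X <- matrix(rnorm(2000 * 500), 2000)
  cat("BLAS reports", RhpcBLASctl::blas_get_num_procs(), "thread(s)\n")
  for (n in c(1, 2)) {
    cat(n, "thread(s):", time_crossprod(X, n), "s elapsed\n")
  }
} else {
  message("RhpcBLASctl is not installed; skipping the thread comparison")
}
```

    With a single-threaded BLAS build, the elapsed times should be roughly the same regardless of the requested thread count.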
