OpenBLAS, OpenMP, and R: is there a decent test?

Posted by 流过昼夜 on 2019-12-06 16:32:50
cbeleites supports Monica

The first things to find out are:

  • Whether the optimized BLAS is used at all.
    Even with NUM_THREADS=1, OpenBLAS is much faster than the default BLAS. Check the timing of the m %*% m multiplication.

  • Once you know that OpenBLAS is used, how many threads are spawned (check with top or htop).

  • Whether the optimized BLAS is used and NUM_THREADS threads are spawned, but all of them execute on the same core. See: Parallel processing in R limited
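The first check can be sketched in R as follows. This is only an illustration, not part of the original answer: the matrix size n is an arbitrary choice, and extSoftVersion()[["BLAS"]] assumes R 3.4.0 or later (on some platforms it may return an empty string). Run it once per BLAS configuration and compare the timings.

```r
# Sketch: time a dense matrix product and report which BLAS R is using.
# n is an arbitrary size chosen for illustration; increase it for clearer timings.
n <- 1000
set.seed(1)
m <- matrix(rnorm(n * n), nrow = n)

# Elapsed wall-clock time of the m %*% m multiplication.
elapsed <- system.time(p <- m %*% m)[["elapsed"]]
cat("m %*% m for n =", n, "took", elapsed, "seconds\n")

# Path of the BLAS library in use (assumes R >= 3.4.0).
blas_path <- extSoftVersion()[["BLAS"]]
cat("BLAS in use:", blas_path, "\n")
```

If the reported path does not point at an OpenBLAS library, the timing reflects a different BLAS and the thread settings will have no effect.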

Not seeing the expected performance gains after switching to OpenBLAS may be related to processor affinity. From the OpenBLAS GitHub page:

"On Linux, OpenBLAS sets the processor affinity by default. This may cause the conflict with R parallel. You can build the library with NO_AFFINITY=1."

Compiling with the NO_AFFINITY=1 flag disables processor affinity.

https://stat.ethz.ch/pipermail/r-sig-hpc/2012-April/001348.html
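One way to look for the affinity symptom from inside R is to compare the process's CPU affinity mask with the number of detected cores. This is a rough sketch under assumptions not stated in the original answer: it assumes Linux, where parallel::mcaffinity() is supported (it returns NULL where it is not), and a small mask is only a hint of pinning, not proof that OpenBLAS caused it.

```r
library(parallel)

# CPU ids the current R process is allowed to run on.
# mcaffinity() returns NULL where affinity control is unsupported.
mask <- mcaffinity()
cores <- detectCores()

if (is.null(mask)) {
  cat("CPU affinity is not supported on this platform\n")
} else {
  cat("R may run on", length(mask), "of", cores, "detected core(s)\n")
  # A mask far smaller than the core count is consistent with the process being pinned.
  if (!is.na(cores) && length(mask) < cores) {
    cat("The process appears to be restricted to CPU(s):", mask, "\n")
  }
}
```

If the mask covers only one core, workers forked from this session would be expected to inherit that restriction and share the core.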
