How to speed up Eigen library's matrix product?


Question


I'm studying simple multiplication of two big matrices using the Eigen library. This multiplication appears to be noticeably slower than both Matlab and Python for the same size matrices.

Is there anything to be done to make the Eigen operation faster?

Problem Details

X : random 1000 x 50000 matrix

Y : random 50000 x 300 matrix

Timing experiments (on my late 2011 Macbook Pro)

Using Matlab: X*Y takes ~1.3 sec

Using Enthought Python: numpy.dot(X, Y) takes ~2.2 sec

Using Eigen: X*Y takes ~2.7 sec

Eigen Details

You can get my Eigen code (as a MEX function): https://gist.github.com/michaelchughes/4742878

This MEX function reads in two matrices from Matlab, and returns their product.

Running this MEX function without the matrix product operation (ie just doing the IO) produces negligible overhead, so the IO between the function and Matlab doesn't explain the big difference in performance. It's clearly the actual matrix product operation.

I'm compiling with g++, with these optimization flags: "-O3 -DNDEBUG"

I'm using the latest stable Eigen header files (3.1.2).

Any suggestions on how to improve Eigen's performance? Can anybody replicate the gap I'm seeing?

UPDATE: The compiler really seems to matter. The original Eigen timing was done using Apple Xcode's version of g++: llvm-g++-4.2.

When I use g++-4.7 downloaded via MacPorts (same CXXOPTIMFLAGS), I get 2.4 sec instead of 2.7.

Any other suggestions of how to compile better would be much appreciated.
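For what it's worth, here is a hedged sketch of a fuller compile line (the Eigen include path is a placeholder, and -march=native / -fopenmp are suggestions rather than the flags actually used above):

```shell
# Hypothetical build line for the standalone benchmark from the gist;
# adjust the compiler name and the Eigen include path for your machine.
g++-4.7 -O3 -DNDEBUG -march=native -fopenmp \
    -I/path/to/eigen-3.1.2 MatProdEigen.cpp -o MatProdEigen
```

-march=native enables whatever SIMD extensions the host CPU supports, and -fopenmp lets Eigen parallelize the product across cores.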

You can also get raw C++ code for this experiment: https://gist.github.com/michaelchughes/4747789

./MatProdEigen 1000 50000 300

reports 2.4 seconds under g++-4.7


Answer 1:


First of all, when doing performance comparisons, make sure you disable turbo-boost (TB). On my system, using gcc 4.5 from MacPorts and with turbo-boost disabled, I get 3.5 s, which corresponds to 8.4 GFLOPS, while the theoretical peak of my 2.3 GHz Core i7 is 9.2 GFLOPS, so that's not too bad.

MATLAB is based on Intel MKL, and judging by the reported performance, it clearly uses a multithreaded version. It is unlikely that a small library like Eigen can beat Intel on its own CPU!

NumPy can use any BLAS library: ATLAS, MKL, OpenBLAS, eigen-blas, etc. I guess that in your case it was using ATLAS, which is fast too.

Finally, here is how you can get better performance: enable multi-threading in Eigen by compiling with -fopenmp. By default, Eigen uses the default number of threads defined by OpenMP. Unfortunately, this number corresponds to the number of logical cores, not physical cores, so make sure hyper-threading is disabled, or set the OMP_NUM_THREADS environment variable to the number of physical cores. Here I get 1.25 s (without TB), and 0.95 s with TB.




Answer 2:


The reason MATLAB is faster is that it uses the Intel MKL. Eigen can use it too (see here), but you of course need to buy it.

That being said, there are a number of reasons Eigen can be slower. To compare Python vs. MATLAB vs. Eigen, you'd really need to code three equivalent versions of an operation in the respective languages. Also note that MATLAB caches results, so you'd really need to start from a fresh MATLAB session to be sure its magic isn't fooling you.

Also, MATLAB's MEX overhead is not nonexistent. The OP there reports that newer versions "fix" the problem, but I'd be surprised if all of the overhead has been cleared completely.




Answer 3:


Eigen doesn't take advantage of the AVX instructions that Intel introduced with the Sandy Bridge architecture. This probably explains most of the performance difference between Eigen and MATLAB. I found a branch that adds support for AVX at https://bitbucket.org/benoitsteiner/eigen but as far as I can tell it has not been merged into the Eigen trunk yet.



Source: https://stackoverflow.com/questions/14783219/how-to-speed-up-eigen-librarys-matrix-product
