Compare blitz++, armadillo, boost::MultiArray

Submitted by 醉酒当歌 on 2019-11-30 01:24:24

Short answer: ./configure CXX=icpc, found by reading the Blitz++ User's Guide.

Long answer:

"To compile blitz++ with intel c++ compiler, a file called bzconfig.h is required in blitz/intel/ folder. But there isn't."

Yes and yes. Blitz++ is supposed to generate the file itself. According to the Blitz++ User's Guide blitz.pdf included in blitz-0.10.tar.gz, section "Installation",

"Blitz++ uses GNU Autoconf, which handles rewriting Makefiles for various platforms and compilers."

More accurately, Blitz++ uses the GNU autotools tool chain (automake, autoconf, configure), which can generate makefiles, configure scripts, header files and more. The bzconfig.h files are supposed to be generated by the configure script, which comes with Blitz++, ready to use.
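To make concrete what such a generated header does, here is a minimal sketch of the pattern. It is not the contents of any real bzconfig.h: the DEMO_* macro names and both helper functions are invented for illustration, and `__restrict__` is a compiler extension that is only referenced if the hypothetical probe had enabled it. The point is that the defines describe one specific compiler, so a header written for a different compiler can switch the wrong code paths on or off.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a configure-generated header. In a real build
// these lines would be written by the configure script after probing the
// compiler. All DEMO_* names are invented for this sketch.
#define DEMO_HAVE_TEMPLATES 1
#define DEMO_HAVE_NAMESPACES 1
/* #undef DEMO_HAVE_RESTRICT */

// Library code then refuses to build if a required feature is missing...
#ifndef DEMO_HAVE_TEMPLATES
#error "A working template implementation is required"
#endif

// ...and adapts optional features to what the probe found.
#ifdef DEMO_HAVE_RESTRICT
#define DEMO_RESTRICT __restrict__
#else
#define DEMO_RESTRICT
#endif

// Scale n doubles in place; the restrict qualifier is used only if configured.
inline void demo_scale(double* DEMO_RESTRICT x, int n, double alpha) {
    for (int i = 0; i < n; ++i) {
        x[i] *= alpha;
    }
}

// Report which optional code path this configuration selected.
inline std::string demo_config_summary() {
#ifdef DEMO_HAVE_RESTRICT
    return "restrict:on";
#else
    return "restrict:off";
#endif
}
```

With the defines as written, `demo_config_summary()` reports the fallback path, and `demo_scale` compiles without the extension keyword.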

"I just copy the one in blitz/ms/bzconfig.h in. That may give a non-optimal configuration."

If "non-optimal" means "non-working" to you, then yes. :-) You need an intel/bzconfig.h that accurately represents your compiler.

"Can anyone tell me how to compile blitz++ with intel c++ compiler?"

Read and follow the fine manual, in particular the section "Installation" mentioned above.

"go into the ‘blitz-VERSION’ directory, and type: ./configure CXX=[compiler] where [compiler] is one of xlc++, icpc, pathCC, xlC, cxx, aCC, CC, g++, KCC, pgCC or FCC. (If you do not choose a C++ compiler, the configure script will attempt to find an appropriate compiler for the current platform.)"

Have you done this? For the Intel compiler, you would need to use ./configure CXX=icpc.

"In the manual, it said run bzconfig script to get the right bzconfig.h. But I don't understand what it means."

I assume that by "it" you mean "that". What do you mean by "manual"? My copy of the Blitz++ User's Guide does not mention bzconfig. Are you sure that you are using the manual that corresponds to your Blitz++ version?

PS: Looking for "bzconfig" in the contents of blitz-0.10, it looks like "bzconfig" is no longer part of Blitz++, but used to be:

find . -name bzconfig -> No results

find . -print0 | xargs -0 grep -a -i -n -e bzconfig:

./blitz/compiler.h:44:    #error  In <blitz/config.h>: A working template implementation is required by Blitz++ (you may need to rerun the compiler/bzconfig script)

That needs to be updated.

./blitz/gnu/bzconfig.h:4:/* blitz/gnu/bzconfig.h. Generated automatically at end of configure. */
./configure.ac:159:# autoconf replacement of bzconfig

There you have it: these bzconfig.h files should be generated by configure.

./ChangeLog.1:1787: will now replace the old file that was generate with the bzconfig

That may be the change that switched to autoconf.

./INSTALL:107:  2. Go into the compiler subdirectory and run the bzconfig

That needs to be updated. Is this what made you look for bzconfig?

./README:27:compiler      Compiler tests (used with obsolete bzconfig script)  

Needs updating: a compiler directory is no longer included.

johnwbyrd

As far as I can tell, you are judging the performance of each matrix library by measuring the speed of multiplying a single matrix by a scalar. Due to its template-based policy, Armadillo will do a very good job at this by breaking down each multiply into parallelizable code for most compilers.

But I suggest you need to rethink your test scope and methodology. For example, you've left out every BLAS implementation. The BLAS function you'd need would be dscal. A vendor-provided implementation for your specific CPU would probably do a good job.
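For reference, the operation dscal performs is small enough to write out. The loop below is only a sketch of its meaning for positive strides (the function name `scale_in_place` is made up here); it is not a tuned implementation. With a CBLAS library linked in, the equivalent call would be `cblas_dscal(n, alpha, x, incx)`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reference loop with the same meaning as BLAS dscal for positive strides:
// multiply n elements of x by alpha, stepping incx elements at a time.
inline void scale_in_place(std::size_t n, double alpha, double* x, std::size_t incx) {
    for (std::size_t i = 0; i < n; ++i) {
        x[i * incx] *= alpha;
    }
}
```

A vendor implementation does the same arithmetic but is typically hand-vectorized for the target CPU, which is why it belongs in any comparison of this kind.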

More relevantly, there are many more things any reasonable vector library would need to be able to do: matrix multiplies, dot products, vector lengths, transposes, and so forth, which aren't addressed by your test. Your test addresses exactly two things: element assignment, which practically speaking is never a bottleneck for vector libraries, and scalar/vector multiplication, which is a BLAS level 1 function provided by every CPU manufacturer.
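As a rough idea of what a broader test could cover, here are naive reference versions of a few of those operations using only the standard library. The function names and the row-major layout are assumptions of this sketch; each library under test would supply its own optimized equivalents, and these loops are useful mainly as a correctness baseline.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <vector>

// Dot product of two equally sized vectors.
inline double vec_dot(const std::vector<double>& a, const std::vector<double>& b) {
    assert(a.size() == b.size());
    return std::inner_product(a.begin(), a.end(), b.begin(), 0.0);
}

// Euclidean length of a vector.
inline double vec_length(const std::vector<double>& a) {
    return std::sqrt(vec_dot(a, a));
}

// Transpose a rows x cols row-major matrix into a cols x rows row-major matrix.
inline std::vector<double> transpose_row_major(const std::vector<double>& m,
                                               std::size_t rows, std::size_t cols) {
    assert(m.size() == rows * cols);
    std::vector<double> t(m.size());
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < cols; ++c) {
            t[c * rows + r] = m[r * cols + c];
        }
    }
    return t;
}

// Naive product of A (n x k) and B (k x m), both row-major; result is n x m.
inline std::vector<double> mat_mul(const std::vector<double>& a,
                                   const std::vector<double>& b,
                                   std::size_t n, std::size_t k, std::size_t m) {
    assert(a.size() == n * k);
    assert(b.size() == k * m);
    std::vector<double> c(n * m, 0.0);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t p = 0; p < k; ++p) {
            for (std::size_t j = 0; j < m; ++j) {
                c[i * m + j] += a[i * k + p] * b[p * m + j];
            }
        }
    }
    return c;
}
```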

There is a discussion of BLAS level 1 vs. compiler-emitted code here.

tl;dr: use Armadillo with the native BLAS and LAPACK libraries for your platform linked in.

"My test showed boost arrays had the same performance as the native/hardcoded C++ code."

You need to compare them with compiler optimisations enabled, that is: -O3 -DNDEBUG -DBOOST_UBLAS_NDEBUG -DBOOST_DISABLE_ASSERTS -DARMA_NO_DEBUG ... When I tested (with em++), Boost ran at least 10X faster once its asserts were disabled and level 3 optimisation was enabled with -O3. Any fair comparison should use these flags.
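The reason such defines matter can be seen with the standard `assert` macro alone. The sketch below uses made-up helper names and a plain `std::vector`, not any of the libraries being compared: every element access goes through a check that the compiler removes entirely when the translation unit is built with -DNDEBUG. The library-specific macros listed above play the analogous role for each library's own checks.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Bounds-checked access: the assert is compiled out when NDEBUG is defined.
inline double& at_checked(std::vector<double>& v, std::size_t i) {
    assert(i < v.size());
    return v[i];
}

// Report whether this translation unit was built with assert checks active.
inline bool checks_enabled() {
#ifdef NDEBUG
    return false;
#else
    return true;
#endif
}

// Scale every element and return the sum, going through the checked accessor
// on each access, as a debug build of an array library might.
inline double scaled_sum(std::vector<double>& v, double alpha) {
    double sum = 0.0;
    for (std::size_t i = 0; i < v.size(); ++i) {
        at_checked(v, i) *= alpha;
        sum += at_checked(v, i);
    }
    return sum;
}
```

Timing `scaled_sum` in a build where `checks_enabled()` returns true against one where it returns false gives a feel for how much per-element checks can distort a benchmark.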
