Intel C++ compiler (ICC) seems to ignore SSE/AVX settings

猫巷女王i 2020-12-19 16:45

I recently downloaded and installed the Intel C++ compiler, Composer XE 2013 for Linux, which is free to use for non-commercial development. http://software.intel.com/e

1 answer
  • 2020-12-19 17:15

    Two points:

    (1) It appears you are using Intel intrinsics in your code -- g++ and icpc do not necessarily implement the same intrinsics (though most of them overlap). Check which header files need to be included (g++ may need a hint, such as a target flag, before it defines the intrinsics for you). Does g++ give an error message when it fails?
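One way to see whether a given compiler invocation has actually enabled AVX is to test the predefined macro `__AVX__`, which g++ and icpc define when AVX code generation is on. The following is a minimal sketch, not the asker's code: the names `simd_path` and `add_arrays` are hypothetical, and the scalar fallback is an assumption added so the file builds with any flags.

```cpp
#include <cassert>
#include <cstddef>
#if defined(__AVX__)
#include <immintrin.h>  // AVX intrinsics; included only when the compiler targets AVX
#endif

// Reports which code path this translation unit was compiled for.
const char* simd_path() {
#if defined(__AVX__)
    return "AVX";
#else
    return "scalar";
#endif
}

// Hypothetical helper: element-wise out[i] = a[i] + b[i].
void add_arrays(const double* a, const double* b, double* out, std::size_t n) {
    assert(a != nullptr && b != nullptr && out != nullptr);
    std::size_t i = 0;
#if defined(__AVX__)
    // Process four doubles per iteration with unaligned loads and stores.
    for (; i + 4 <= n; i += 4) {
        __m256d va = _mm256_loadu_pd(a + i);
        __m256d vb = _mm256_loadu_pd(b + i);
        _mm256_storeu_pd(out + i, _mm256_add_pd(va, vb));
    }
#endif
    // Scalar loop: handles the tail, or the whole array when AVX is disabled.
    for (; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}
```

Printing the result of `simd_path()` from builds made with different flags shows whether a flag really switched the AVX path on for that compiler.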

    (2) Passing the compiler flags does not mean that the instructions will be generated (from icpc --help): -msse3 May generate Intel(R) SSE3, SSE2, and SSE instructions

    These flags are usually just hints to the compiler. You may want to look at -xHost and -fast.

    "It seems no matter what options I try it compiles but does not make optimal use of the AVX code."

    How have you checked this? You may not see a 4x speedup if there are other bottlenecks (such as memory bandwidth).

    EDIT (based on question edits):

    It looks like icc scalar is faster than gcc scalar -- it is possible that icc is vectorizing the scalar code. If this is the case, I would not expect a 4x speedup from icc when manually coding the vectorization.
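To illustrate what "vectorizing the scalar code" can mean, here is a plain loop with no intrinsics that an optimizing compiler is free to turn into SIMD instructions on its own. This is only a sketch: the kernel and the name `scale_add` are hypothetical and are not taken from the question.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical scalar kernel: y[i] = a * x[i] + y[i].
// No intrinsics are used; whether SIMD instructions are emitted is up to
// the compiler and the optimization flags. With g++, -fopt-info-vec
// reports the loops it vectorized, and -S writes the assembly for inspection.
void scale_add(double a, const std::vector<double>& x, std::vector<double>& y) {
    const std::size_t n = x.size() < y.size() ? x.size() : y.size();
    for (std::size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

If the compiler already vectorizes a loop like this, a hand-written intrinsic version of the same loop has little left to gain over it.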

    As for the difference between icc at 5.782332s and gcc at 3.509130s (for nvec 5000000): this is unexpected. I cannot tell, based on the information I have, why there is a difference in runtime between the two compilers. I would recommend looking at the emitted code (http://www.delorie.com/djgpp/v2faq/faq8_20.html) from both compilers. Also, make sure that your measurements are reproducible (e.g. memory layout on multi-socket machines, hot/cold caches, background processes, etc.).
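One way to check reproducibility is to run the kernel several times after a few warm-up runs and compare the minimum, median, and maximum times. The sketch below assumes a C++11 compiler; `TimingSummary` and `time_repeated` are hypothetical names, and the warm-up and repeat counts are up to the caller.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <vector>

// Summary of repeated wall-clock measurements, in seconds.
struct TimingSummary {
    double min_s;
    double median_s;
    double max_s;
};

// Runs `kernel` `warmup` times untimed (to warm caches), then `repeats`
// times timed, and returns the min, median, and max of the timed runs.
template <typename Kernel>
TimingSummary time_repeated(Kernel kernel, std::size_t warmup, std::size_t repeats) {
    TimingSummary summary = {0.0, 0.0, 0.0};
    for (std::size_t i = 0; i < warmup; ++i) {
        kernel();
    }
    if (repeats == 0) {
        return summary;  // nothing was measured
    }
    std::vector<double> samples;
    samples.reserve(repeats);
    for (std::size_t i = 0; i < repeats; ++i) {
        const auto start = std::chrono::steady_clock::now();
        kernel();
        const auto stop = std::chrono::steady_clock::now();
        samples.push_back(std::chrono::duration<double>(stop - start).count());
    }
    std::sort(samples.begin(), samples.end());
    summary.min_s = samples.front();
    summary.median_s = samples[samples.size() / 2];
    summary.max_s = samples.back();
    return summary;
}
```

A large gap between the minimum and the maximum suggests that caches, background processes, or memory placement are affecting the numbers, in which case a single timing from each compiler is not a reliable comparison.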
