Intel C++ compiler, ICC, seems to ignore SSE/AVX settings

I have recently downloaded and installed the Intel C++ compiler, Composer XE 2013 for Linux, which is free to use for non-commercial development. http://software.intel.com/e

1 answer

滥情空心

2020-12-19 17:15

Two points:

(1) It appears you are using Intel intrinsics in your code -- g++ and icpc do not necessarily implement the same intrinsics (though most of them overlap). Check which header files need to be included (g++ may need a hint to define the intrinsics for you). Does g++ give an error message when it fails?
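To make the header point concrete, here is a minimal sketch of a loop that uses AVX intrinsics only when the compiler reports AVX support through the predefined __AVX__ macro, and falls back to plain scalar code otherwise. The function name add_arrays is hypothetical and not taken from the question. With g++ the AVX path is normally only compiled when a flag such as -mavx is passed; I would expect icpc to behave the same way with -mavx or -xAVX, but treat that as an assumption and check it on your installation.

```cpp
#include <cstddef>

#if defined(__AVX__)
#include <immintrin.h>  // declares the AVX intrinsics (_mm256_*)
#endif

// Hypothetical helper: a[i] += b[i] for i in [0, n).
// Uses 8-wide AVX adds when the compiler was told AVX is allowed,
// then finishes any remaining elements with scalar code.
void add_arrays(float* a, const float* b, std::size_t n) {
    std::size_t i = 0;
#if defined(__AVX__)
    for (; i + 8 <= n; i += 8) {
        __m256 va = _mm256_loadu_ps(a + i);  // unaligned load of 8 floats
        __m256 vb = _mm256_loadu_ps(b + i);
        _mm256_storeu_ps(a + i, _mm256_add_ps(va, vb));
    }
#endif
    for (; i < n; ++i) {
        a[i] += b[i];  // scalar tail (or the whole loop without AVX)
    }
}
```

If g++ rejects the intrinsics outright, the error message usually indicates that the target instruction set is not enabled, which is the "hint" mentioned above.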

(2) The compiler flags do not guarantee that those instructions will be generated. From icpc --help: -msse3 May generate Intel(R) SSE3, SSE2, and SSE instructions

These flags are usually just hints to the compiler. You may want to look at -xHost and -fast.
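One quick sanity check is to ask the compiler which instruction-set macros it has predefined for the current flags. The sketch below assumes the GCC-style macros __SSE2__, __SSE3__, __AVX__ and __AVX2__; the function name enabled_simd_macros is hypothetical. Note that a defined macro only tells you what the compiler is permitted to emit, not what it actually emitted for a given loop.

```cpp
#include <string>

// Hypothetical helper: lists the SIMD feature macros the compiler
// predefined for this translation unit, or "none" if there are none.
std::string enabled_simd_macros() {
    std::string out;
#if defined(__SSE2__)
    out += "SSE2 ";
#endif
#if defined(__SSE3__)
    out += "SSE3 ";
#endif
#if defined(__AVX__)
    out += "AVX ";
#endif
#if defined(__AVX2__)
    out += "AVX2 ";
#endif
    return out.empty() ? std::string("none") : out;
}
```

Printing the result from builds with different flags (for example with and without -mavx) shows whether a flag changed the target at all.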

"It seems no matter what options I try it compiles but does not make optimal use of the AVX code."

How have you checked this? You may not see a 4x speedup if there are other bottlenecks (such as memory bandwidth).

EDIT (based on question edits):

It looks like icc scalar is faster than gcc scalar -- it is possible that icc is vectorizing the scalar code. If this is the case, I would not expect a 4x speedup from icc when manually coding the vectorization.

As for the difference between icc at 5.782332s and gcc at 3.509130s (for nvec 5000000): this is unexpected. I cannot tell from the information I have why the runtimes differ between the two compilers. I would recommend looking at the emitted code (http://www.delorie.com/djgpp/v2faq/faq8_20.html) from both compilers. Also, make sure that your measurements are reproducible (e.g. control for memory layout on multi-socket machines, hot/cold caches, background processes, etc.).
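For the reproducibility point, a minimal timing sketch might look like the following. It is not the harness from the question: the name best_time_seconds is hypothetical, and taking the minimum over several runs after one warm-up run is just one common convention for reducing noise from cold caches and background processes.

```cpp
#include <algorithm>
#include <chrono>
#include <limits>

// Hypothetical helper: runs `work` once as a warm-up, then `repeats`
// more times, and returns the fastest observed wall-clock time in seconds.
template <class F>
double best_time_seconds(F&& work, int repeats) {
    using clock = std::chrono::steady_clock;  // monotonic clock
    work();  // warm-up run: touches memory and fills caches
    double best = std::numeric_limits<double>::infinity();
    for (int r = 0; r < repeats; ++r) {
        const auto t0 = clock::now();
        work();
        const auto t1 = clock::now();
        const double s = std::chrono::duration<double>(t1 - t0).count();
        best = std::min(best, s);
    }
    return best;
}
```

Wrap each kernel (scalar and intrinsic versions) in a lambda and compare the values returned for the icc and gcc builds. Make sure the result of the kernel is actually used afterwards, otherwise the compiler may remove the work entirely.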
