SSE 4 instructions generated by Visual Studio 2013 Update 2 and Update 3

既然无缘 2020-12-31 03:45

If I compile this code in VS 2013 Update 2 or Update 3 (what follows comes from Update 3):

#include "stdafx.h"
#include 
#include 

         


        
1 Answer
  • 2020-12-31 04:42

    This is documented behaviour:

    The Auto-Vectorizer also uses the newer, SSE4.2 instruction set if your computer supports it.

    If you look closer at the code the compiler generates you'll see that the use of the SSE4.2 instructions is dependent on a runtime test:

    cmp DWORD PTR ___isa_available, 2
    jl  SHORT $LN11@Code
    

    The value 2 here apparently means SSE4.2.
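
To make the runtime test concrete, here is a minimal sketch of the same dispatch pattern in portable C++. The names `IsaLevel`, `use_sse4_path`, `max_baseline`, `max_sse4_path` and `max_dispatch` are hypothetical and not part of the CRT, and the meaning of the value 2 is taken from the inference above rather than from documentation. Both code paths are plain C++ here, so this only models the branch the compiler emits, not the SSE4 instructions themselves.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the values held by ___isa_available.
// Assumption: 2 or higher means SSE4.2 can be used, as inferred above.
enum IsaLevel { kIsaBaseline = 0, kIsaSse2 = 1, kIsaSse42 = 2 };

// Mirrors "cmp ___isa_available, 2 / jl fallback":
// the SSE4 path is taken only when the level is at least 2.
bool use_sse4_path(int isa_available) {
    return isa_available >= kIsaSse42;
}

// Fallback path: uses only operations that every target CPU supports.
long max_baseline(const long* data, std::size_t n) {
    assert(n > 0);
    long best = data[0];
    for (std::size_t i = 1; i < n; ++i) {
        if (data[i] > best) best = data[i];
    }
    return best;
}

// Stand-in for the path where the compiler would emit SSE4 instructions
// such as PMAXSD. Kept as plain C++ so the sketch compiles anywhere.
long max_sse4_path(const long* data, std::size_t n) {
    assert(n > 0);
    long best = data[0];
    for (std::size_t i = 1; i < n; ++i) {
        best = data[i] > best ? data[i] : best;
    }
    return best;
}

// Chooses a path at run time; both paths must return the same result.
long max_dispatch(int isa_available, const long* data, std::size_t n) {
    return use_sse4_path(isa_available) ? max_sse4_path(data, n)
                                        : max_baseline(data, n);
}
```

The bug discussed here corresponds to the case where an SSE4 instruction ends up outside the guarded branch, so a CPU below level 2 still executes it.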

    I was, however, able to confirm the bug in your second example. It turns out the Core 2 PC I was using supports SSE4.1 and the PMAXSD instruction, so I had to test it on a PC with a Pentium 4 CPU to get the illegal instruction exception. You should submit a bug report to Microsoft Connect. Be sure to mention the specific Core 2 CPU model your example code fails on.

    As for a workaround I can only suggest changing the optimization level for the affected function. Switching from optimizing for speed to optimizing for size seems to generate much the same code as would be used with only SSE2 instructions. You can use #pragma optimize to switch the optimization level like this:

    #pragma optimize("s", on)
    
    long Code(Buffer* buff)
    {
         ...
    }
    
    #pragma optimize("", on)
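
A fuller sketch of this workaround, assuming a made-up `Buffer` layout and function body (the original definitions are not shown in the question), might look like this. The `_MSC_VER` guard is an addition so that other compilers skip the MSVC-specific pragma.

```cpp
#include <cstddef>

// Hypothetical buffer type; the Buffer from the question is not shown.
struct Buffer {
    const long* data;
    std::size_t count;
};

#ifdef _MSC_VER
#pragma optimize("s", on)   // favor small code for the functions that follow
#endif

// Hypothetical body: returns the largest element, or 0 for an empty buffer.
long Code(Buffer* buff)
{
    long best = 0;
    for (std::size_t i = 0; i < buff->count; ++i) {
        if (i == 0 || buff->data[i] > best) best = buff->data[i];
    }
    return best;
}

#ifdef _MSC_VER
#pragma optimize("", on)    // restore the settings given on the command line
#endif
```

Only the functions defined between the two pragmas are affected, so the rest of the translation unit is still optimized for speed.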
    

    As documented on this bug report, /d2Qvec-sse2only is an undocumented flag that works on Update 3 (and possibly Update 2) to prevent the compiler from outputting SSE4 instructions. Naturally, this can prevent some loops from being vectorized. /d2Qvec-sse2only may cease to work at any point (it is "subject to future change without notice"), possibly in future versions of VC.

    Microsoft claims that this problem is fixed in Update 4, and also in Update 4 CTP 2 (which is not for production use).
