Load constant floats into SSE registers

别那么骄傲 2021-02-20 04:18

I'm trying to figure out an efficient way to load compile-time constant floats into SSE (SSE2/SSE3) registers. I've tried simple code like this:

const __m128 x          


        
4 answers
  •  梦谈多话
    2021-02-20 05:00

    If you want to force it to a single load, you could try (gcc):

    __attribute__((aligned(16))) float vec[4] = { 1.0f, 1.1f, 1.2f, 1.3f };
    __m128 v = _mm_load_ps(vec); // edit by sor: removed the "&" because vec is already an address
    

    If you have Visual C++, use __declspec(align(16)) to request the proper constraint.
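
The two alignment spellings can be hidden behind one macro so the same source builds with both compilers. The following is a minimal sketch, not a definitive implementation: `ALIGN16` and `load_constants` are hypothetical names introduced here for illustration, and the `assert` is only a debug check that the requested alignment was actually honoured.

```c
#include <assert.h>
#include <stdint.h>
#include <xmmintrin.h>  /* SSE: __m128, _mm_load_ps, _mm_storeu_ps */

/* ALIGN16 is a hypothetical convenience macro, not part of any compiler header. */
#if defined(_MSC_VER)
#  define ALIGN16 __declspec(align(16))
#else
#  define ALIGN16 __attribute__((aligned(16)))
#endif

/* Loads the four constants with one aligned load and copies the lanes to out[0..3]. */
void load_constants(float out[4])
{
    ALIGN16 float vec[4] = { 1.0f, 1.1f, 1.2f, 1.3f };

    /* _mm_load_ps requires a 16-byte-aligned address. */
    assert(((uintptr_t)vec & 15u) == 0);

    __m128 v = _mm_load_ps(vec);
    _mm_storeu_ps(out, v);  /* unaligned store: out may have any alignment */
}
```

Calling `load_constants(buf)` with a plain `float buf[4]` fills `buf` with the four values in array order, since `_mm_load_ps` puts `vec[0]` in the lowest lane.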

    On my system, compiling this with gcc -m32 -msse -O2 produces the following assembly (gcc / AT&T syntax). Without any optimization the output is more cluttered, but it still ends in a single movaps:

        andl    $-16, %esp
        subl    $16, %esp
        movl    $0x3f800000, (%esp)
        movl    $0x3f8ccccd, 4(%esp)
        movl    $0x3f99999a, 8(%esp)
        movl    $0x3fa66666, 12(%esp)
        movaps  (%esp), %xmm0
    

    Note that the compiler aligns the stack pointer before allocating stack space and storing the constants there. Leaving out the __attribute__((aligned(16))) may, depending on your compiler, produce code that skips this alignment and is therefore incorrect, so beware, and check the disassembly.
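
When the alignment of a buffer cannot be guaranteed, the unaligned load intrinsic `_mm_loadu_ps` is the safe choice, because `_mm_load_ps` on a misaligned address can fault. The sketch below assumes you want the aligned load whenever the address allows it; `load4` and `lane0` are hypothetical helper names, not part of the SSE headers. In practice, simply always calling `_mm_loadu_ps` is a simpler alternative.

```c
#include <stdint.h>
#include <xmmintrin.h>  /* SSE: _mm_load_ps, _mm_loadu_ps, _mm_cvtss_f32 */

/* Hypothetical helper: aligned load when the address permits, unaligned otherwise. */
__m128 load4(const float *p)
{
    if (((uintptr_t)p & 15u) == 0)
        return _mm_load_ps(p);   /* requires 16-byte alignment */
    return _mm_loadu_ps(p);      /* works for any alignment */
}

/* Hypothetical helper: returns the lowest lane of v. */
float lane0(__m128 v)
{
    return _mm_cvtss_f32(v);
}
```

For example, `load4(buf + 1)` on a `float buf[8]` reads `buf[1]` through `buf[4]` regardless of where `buf` itself is placed.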

    Additionally:
    Since you asked how to put the constants into the code itself, simply try the above with a static qualifier on the float array. That produces the following assembly:

        movaps  vec.7330, %xmm0
        ...
    vec.7330:
        .long   1065353216
        .long   1066192077
        .long   1067030938
        .long   1067869798
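
For reference, the static variant can be written out in full and compared against `_mm_setr_ps`, which takes its arguments in the same lowest-lane-first order as the array. This is a sketch under assumptions: `ALIGN16`, `k_vec`, and the three function names are hypothetical, and whether a compiler turns the `_mm_setr_ps` call into a single load from constant data depends on the compiler and its flags, so the disassembly is still worth checking.

```c
#include <xmmintrin.h>  /* SSE: _mm_load_ps, _mm_setr_ps, _mm_cmpeq_ps, _mm_movemask_ps */

/* ALIGN16 is a hypothetical convenience macro, not part of any compiler header. */
#if defined(_MSC_VER)
#  define ALIGN16 __declspec(align(16))
#else
#  define ALIGN16 __attribute__((aligned(16)))
#endif

/* Read-only, 16-byte-aligned data with static storage duration. */
ALIGN16 static const float k_vec[4] = { 1.0f, 1.1f, 1.2f, 1.3f };

/* Loads the constants from the static array with one aligned load. */
__m128 constants_from_memory(void)
{
    return _mm_load_ps(k_vec);
}

/* Builds the same vector from literals; the first argument goes to the lowest lane. */
__m128 constants_from_setr(void)
{
    return _mm_setr_ps(1.0f, 1.1f, 1.2f, 1.3f);
}

/* Returns 1 if all four lanes of a and b compare equal, 0 otherwise. */
int same_lanes(__m128 a, __m128 b)
{
    return _mm_movemask_ps(_mm_cmpeq_ps(a, b)) == 0xF;
}
```

Note that `_mm_set_ps` takes its arguments in the opposite order (highest lane first), so passing the same literals to it yields a reversed vector.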
    
