Clear upper bytes of __m128i

Happy的楠姐 2021-01-13 08:04

How do I clear the 16 - i upper bytes of a __m128i?

I've tried this; it works, but I'm wondering if there is a better (shorter, faster) way

2 Answers
  •  佛祖请我去吃肉
    2021-01-13 08:32

    If these were ordinary 64-bit values, I'd use something like:

        mask = (1ULL << (i * 8)) - 1;   // valid only for i < 8
    

    But take care when generalizing this to 128 bits: the built-in shift operators are undefined once the shift count reaches the operand width, and they do not apply to a whole __m128i at all.

    For 128 bits, you could build separate upper and lower masks, e.g.:

        __m128i mask = _mm_set_epi64x(
           // high 64 bits
           i > 15 ? -1LL : i > 8 ? (long long)((1ULL << ((i - 8) * 8)) - 1) : 0,
           // low 64 bits
           i > 7 ? -1LL : (long long)((1ULL << (i * 8)) - 1)
        );
    

    (Note that _mm_set_epi64x takes the high half first and the low half second.) Alternatively, you can fill a two-element uint64_t array and load the 128-bit mask directly from memory using its address.
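
    The array variant could look roughly like the sketch below. It assumes a little-endian x86 target with SSE2 and 0 <= i <= 16; the helper names low_bytes_mask and clear_upper_bytes are made up for illustration and do not come from the question.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: mask whose low i bytes are 0xFF, for 0 <= i <= 16. */
static __m128i low_bytes_mask(int i)
{
    uint64_t halves[2];

    assert(i >= 0 && i <= 16);

    /* halves[0] holds the low 64 bits, halves[1] the high 64 bits
       (little-endian layout, as on x86). The explicit range checks
       avoid shifting a 64-bit value by 64, which is undefined. */
    halves[0] = i >= 8 ? UINT64_MAX : (UINT64_C(1) << (i * 8)) - 1;
    halves[1] = i >= 16 ? UINT64_MAX
              : i > 8   ? (UINT64_C(1) << ((i - 8) * 8)) - 1
              : 0;

    /* Unaligned load, so the array needs no special alignment. */
    return _mm_loadu_si128((const __m128i *)halves);
}

/* Hypothetical helper: clear the 16 - i upper bytes of v. */
static __m128i clear_upper_bytes(__m128i v, int i)
{
    return _mm_and_si128(v, low_bytes_mask(i));
}
```

    Unlike the shift intrinsics, this works with a runtime i, at the cost of a store and a reload through memory.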

    However, neither of these methods feels as natural as the original one: they merely widen the elements from 1 byte to 8 bytes and still work on halves. A proper shift on a single 128-bit variable would be much preferable.

    I just came across this topic on 128-bit shifts:

    Looking for sse 128 bit shift operation for non-immediate shift value

    It looks possible, but I've never used it. You could try the one-liner above with the appropriate SSE intrinsic from there. I'd give this one a shot:

        mask = _mm_slli_si128(_mm_cvtsi32_si128(1), i); // count is in bytes (see emmintrin.h); i must be a compile-time constant
    

    Then subtract one in whatever way you prefer. Note that SSE2 has no full 128-bit subtraction: _mm_sub_epi64 and its siblings work per lane and do not borrow across lanes, so this step needs extra care.
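
    When i is known at compile time, one way to avoid the subtraction altogether is to shift an all-ones register right by 16 - i bytes instead of shifting 1 left. This swaps in a different technique from the shift-and-subtract idea above, so treat it as a sketch: LOW_BYTES_MASK and keep_low_5 are hypothetical names, and the byte count must be an integer constant expression.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical macro: mask with the low n bytes set to 0xFF.
   n must be an integer constant expression in 0..16, because
   _mm_srli_si128 takes its byte count as an immediate. */
#define LOW_BYTES_MASK(n) _mm_srli_si128(_mm_set1_epi8(-1), 16 - (n))

/* Hypothetical example: copy src to dst, keeping only the low 5 bytes. */
static void keep_low_5(uint8_t *dst, const uint8_t *src)
{
    __m128i v;

    assert(dst && src);

    v = _mm_loadu_si128((const __m128i *)src);
    v = _mm_and_si128(v, LOW_BYTES_MASK(5)); /* zero bytes 5..15 */
    _mm_storeu_si128((__m128i *)dst, v);
}
```

    For a count that is only known at run time, this approach does not apply directly; a mask loaded from memory is one alternative.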
