Clear upper bytes of __m128i

Happy的楠姐 2021-01-13 08:04

How do I clear the 16 - i upper bytes of a __m128i?

I've tried this; it works, but I'm wondering if there is a better (shorter, faster) way.

2 Answers

佛祖请我去吃肉 (OP)

2021-01-13 08:32
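If these were ordinary 64-bit values, I'd use something like:
```
    uint64_t mask = (1ULL << (i * 8)) - 1;  // low i bytes set; only valid for i < 8
```
But take care when generalizing this to 128 bits: the built-in shift operators don't necessarily work at these ranges (in C, shifting by the full width of the type or more is undefined).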
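To make the shift-range caveat concrete, here is a minimal sketch of the 64-bit case with the full-width shift guarded. The helper name `low_bytes_mask64` is made up for illustration, and `i` is assumed to be the number of low bytes to keep.

```c
#include <stdint.h>

/* Illustrative helper (not from any library): mask with the low i bytes set,
 * for i in 0..8. */
static uint64_t low_bytes_mask64(unsigned i) {
    /* Shifting a 64-bit value by 64 is undefined in C, so i >= 8 is
     * special-cased instead of being fed to the shift. */
    return i >= 8 ? UINT64_MAX : (UINT64_C(1) << (i * 8)) - 1;
}
```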

For 128 bits, you could build separate upper and lower masks, e.g.:
```
    __m128i mask = _mm_set_epi64x(
       i > 8 ? (i > 15 ? -1LL : (long long)((1ULL << ((i - 8) * 8)) - 1)) : 0,  // high 64 bits
       i > 7 ? -1LL : (long long)((1ULL << (i * 8)) - 1)                        // low 64 bits
    );
```
(`_mm_set_epi64x` takes the high half first; check me on this one, I'm not very familiar with these intrinsics.) Alternatively, you can build the mask in a two-element uint64 array and load the 128-bit mask directly from memory using its address.
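The array-based alternative might look like the sketch below. The function names `low_bytes_mask` and `clear_upper_bytes` are hypothetical, `i` is assumed to be the number of low bytes to keep (0 to 16), and an unaligned load is used so the array needs no special alignment.

```c
#include <emmintrin.h>  /* SSE2 */
#include <stdint.h>

/* Hypothetical helper: 128-bit mask with the low i bytes set, i in 0..16. */
static __m128i low_bytes_mask(unsigned i) {
    uint64_t halves[2];
    /* halves[0] is the low 64 bits, halves[1] the high 64 bits. Shifts by 64
     * are avoided because they are undefined in C. */
    halves[0] = i >= 8 ? UINT64_MAX : (UINT64_C(1) << (i * 8)) - 1;
    halves[1] = i <= 8 ? 0
              : i >= 16 ? UINT64_MAX
              : (UINT64_C(1) << ((i - 8) * 8)) - 1;
    return _mm_loadu_si128((const __m128i *)halves);
}

/* Hypothetical helper: zero the upper 16 - i bytes of v. */
static __m128i clear_upper_bytes(__m128i v, unsigned i) {
    return _mm_and_si128(v, low_bytes_mask(i));
}
```

The round trip through memory is the price of this version; whether that matters depends on how hot the code path is.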

However, neither of these methods feels as natural as the original one: they just extend the elements from 1 to 8 bytes, but are still partial. It would be much preferable to do a proper shift on a single 128-bit variable.

I just came across this topic regarding 128-bit shifts:

Looking for sse 128 bit shift operation for non-immediate shift value

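It looks like it's possible, but I've never used it. You could try the above one-liner with the appropriate SSE intrinsic from there. I'd give this one a shot:
```
    mask = _mm_slli_si128(_mm_cvtsi32_si128(1), i); // emmintrin.h shows the second argument is in bytes already; it must be a compile-time constant
```
And then just subtract one using your preferred way (I'd be surprised if this type supports a plain old operator-). Keep in mind that SSE2 has no single 128-bit-wide subtract: `_mm_sub_epi64` works on each 64-bit lane separately, so a borrow out of the low half has to be handled by hand.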
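Because `_mm_slli_si128` only accepts a compile-time constant as its byte count, a run-time `i` needs a different route. One SSE2-only possibility, offered as an alternative to the shift-and-subtract idea rather than an implementation of it, is to compare each byte's index against `i`. The helper name `keep_low_bytes` is made up for this sketch, and `i` is assumed to be in the range 0 to 16.

```c
#include <emmintrin.h>  /* SSE2 */

/* Hypothetical helper: keep the low i bytes of v (i in 0..16), zero the rest. */
static __m128i keep_low_bytes(__m128i v, int i) {
    /* Byte k of idx holds the value k; _mm_setr_epi8 lists bytes from the
     * lowest to the highest. */
    const __m128i idx = _mm_setr_epi8(0, 1, 2, 3, 4, 5, 6, 7,
                                      8, 9, 10, 11, 12, 13, 14, 15);
    /* Signed byte compare i > k gives 0xFF in every byte whose index is
     * below i and 0x00 elsewhere. */
    const __m128i mask = _mm_cmpgt_epi8(_mm_set1_epi8((char)i), idx);
    return _mm_and_si128(v, mask);
}
```

This avoids both the 128-bit shift and the subtraction, at the cost of one constant vector and one broadcast of `i`.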