Testing NEON SIMD registers for equality over all lanes

Submitted by 巧了我就是萌 on 2019-12-11 04:34:54

Question


I'm using NEON intrinsics with clang.

I want to test two uint32x4_t SIMD values for equality over all lanes. I don't want four per-lane results, just a single result that tells me whether A and B are equal in every lane.

On Intel AVX, I would use something like:

_mm256_testz_si256( _mm256_xor_si256( A, B ), _mm256_set1_epi64x( -1 ) )

What would be a good way to perform an all-lane equality test for NEON SIMD?

I am assuming I will need intrinsics that operate across lanes. Does ARM Neon have those features?


Answer 1:


Try this:

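uint16x4_t t = vqmovn_u32(veorq_u32(a, b));   // saturating narrow: a nonzero XOR lane stays nonzero
bool all_equal = vget_lane_u64(vreinterpret_u64_u16(t), 0) == 0;   // are all four 16-bit lanes zero?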

I expect the compiler to find target-specific optimizations when implementing that test.
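As a minimal sketch, the two lines above can be wrapped in a function. The names all_lanes_equal_u32 and equal4_u32 are hypothetical and not from the original answer; the array-based wrapper and its memcmp fallback are an assumption added only so the example also builds on targets without NEON.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>

/* True when every 32-bit lane of a equals the matching lane of b. */
static inline bool all_lanes_equal_u32(uint32x4_t a, uint32x4_t b)
{
    /* XOR leaves a nonzero lane wherever a and b differ. */
    uint32x4_t diff = veorq_u32(a, b);
    /* Saturating narrow keeps nonzero lanes nonzero; a plain narrow
       could lose a difference that sits only in the upper 16 bits. */
    uint16x4_t t = vqmovn_u32(diff);
    /* View the four 16-bit lanes as one 64-bit value and compare to 0. */
    return vget_lane_u64(vreinterpret_u64_u16(t), 0) == 0;
}
#endif

/* Hypothetical array-based wrapper so the sketch builds on any target. */
bool equal4_u32(const uint32_t a[4], const uint32_t b[4])
{
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    return all_lanes_equal_u32(vld1q_u32(a), vld1q_u32(b));
#else
    return memcmp(a, b, 4 * sizeof(uint32_t)) == 0; /* scalar fallback */
#endif
}
```

The saturating form matters here: with a truncating narrow such as vmovn_u32(), two lanes that differ only in bit 16 or above would compare as equal.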


I just realised something handy...

If you want to test that all lanes are less than some power of two, you can do this by replacing vqmovn_u32() with vqshrn_n_u32(); and I believe this can be extended to being within +/- a power of two (including the lower bound, excluding the upper bound) for signed types using vqrshrn_n_s32(). For example, you should be able to accept both -1 and 0 in a single test using vqrshrn_n_s32(x, 1).
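A rough sketch of both range tests, assuming a bound of 2^8 for the unsigned case and the -1/0 case mentioned above for the signed one. The function names are hypothetical, and the scalar branches are an assumption added so the example builds without NEON; they state the ranges the NEON branches are expected to accept.

```c
#include <stdbool.h>
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* True when every unsigned lane is below 2^8 = 256. */
bool all_below_256_u32(const uint32_t v[4])
{
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    /* Shift right by 8 and narrow with saturation: a lane becomes
       zero only if the original value was less than 256. */
    uint16x4_t t = vqshrn_n_u32(vld1q_u32(v), 8);
    return vget_lane_u64(vreinterpret_u64_u16(t), 0) == 0;
#else
    for (int i = 0; i < 4; ++i)
        if (v[i] >= 256u) return false;
    return true;
#endif
}

/* True when every signed lane is -1 or 0. */
bool all_minus1_or_0_s32(const int32_t v[4])
{
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    /* Rounding shift by 1 computes (x + 1) >> 1, which is zero
       only for x == -1 and x == 0. */
    int16x4_t t = vqrshrn_n_s32(vld1q_s32(v), 1);
    return vget_lane_u64(vreinterpret_u64_s16(t), 0) == 0;
#else
    for (int i = 0; i < 4; ++i)
        if (v[i] != -1 && v[i] != 0) return false;
    return true;
#endif
}
```

For a general shift count n, the rounding signed form should accept lanes in the range -2^(n-1) <= x < 2^(n-1), which matches the "including the lower bound, excluding the upper bound" description.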




Answer 2:


If you just want to know whether two vectors are equal or not, try the following code:

uint32x4_t result = vceqq_u32(a, b);      // each lane is all ones where a and b match
if (vminvq_u32(result) != 0xffffffff) {   // vminvq_u32 is only available on AArch64
     // not equal
} else {
     // equal
}

See ARM's manual: CMEQ and UMINV
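Since vminvq_u32() is an AArch64 intrinsic, here is a hedged sketch that keeps the same compare-then-reduce idea but swaps in a pairwise minimum (vpmin_u32) on 32-bit ARM. That 32-bit path is not part of the original answer; the function names and the memcmp fallback for non-NEON targets are hypothetical as well.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>

static inline bool all_equal_mask_u32(uint32x4_t a, uint32x4_t b)
{
    /* Each lane becomes 0xffffffff where equal, 0 where different. */
    uint32x4_t eq = vceqq_u32(a, b);
#if defined(__aarch64__)
    /* Horizontal minimum: all ones only if every lane matched. */
    return vminvq_u32(eq) == 0xffffffffu;
#else
    /* 32-bit ARM: fold the high half onto the low half, then take
       the pairwise minimum of the two remaining lanes. */
    uint32x2_t m = vmin_u32(vget_low_u32(eq), vget_high_u32(eq));
    m = vpmin_u32(m, m);
    return vget_lane_u32(m, 0) == 0xffffffffu;
#endif
}
#endif

/* Hypothetical array-based wrapper so the sketch builds on any target. */
bool mask_equal4_u32(const uint32_t a[4], const uint32_t b[4])
{
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    return all_equal_mask_u32(vld1q_u32(a), vld1q_u32(b));
#else
    return memcmp(a, b, 4 * sizeof(uint32_t)) == 0; /* scalar fallback */
#endif
}
```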



Source: https://stackoverflow.com/questions/41005281/testing-neon-simd-registers-for-equality-over-all-lanes
