Question
I have the result of comparing two floating-point vectors, and based on that result I need to do the following:
neon_gt_res = vcgtq_f32(temp1, temp2);
if(neon_gt_res[0]) array[0] |= (unsigned char)0x01;
if(neon_gt_res[1]) array[0] |= (unsigned char)0x02;
if(neon_gt_res[2]) array[0] |= (unsigned char)0x04;
if(neon_gt_res[3]) array[0] |= (unsigned char)0x08;
But writing it like this amounts to four more scalar comparisons. How can I write this optimally with NEON C intrinsics?
On x86, this would be array[0] |= _mm_movemask_ps(cmp_gt_res);
Answer 1:
vmov.i32 qmask, #1
vand qres, qmask, qres
vsra.u64 qres, qres, #31 // shift by 31 so the odd lane's bit lands in bit 1
vsli.64 dres_bottom, dres_top, #2
The four least significant bits of dres_bottom (the low half of qres) now hold the bitmap.
Edit: an improved version of the above:
vshr.u64 qres, qres, #31
vsli.64 dres_bot, dres_top, #2
// the four LSBs already contain the bitmap, the rest is optional:
vbic.i16 dres_bot, #0xf0
// you can now use byte 0 of dres_bot as the result.
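Since the question asks for C intrinsics, here is a minimal sketch of how the improved sequence above might be expressed with them. The function names pack_mask_neon and pack_gt_mask are hypothetical, the code assumes the usual little-endian lane layout, and the scalar fallback is only there so the sketch also builds on targets without NEON. It is not taken from the original answer.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>

/* Hypothetical helper: pack a 4-lane compare result (each lane is
   all-ones or zero) into bits 0..3 of a byte. */
static inline uint8_t pack_mask_neon(uint32x4_t cmp)
{
    /* vshr.u64 #31 on the two 64-bit lanes:
       bit 0 = even 32-bit lane, bit 1 = odd 32-bit lane. */
    uint64x2_t pairs = vshrq_n_u64(vreinterpretq_u64_u32(cmp), 31);

    /* vsli.64 #2: keep the low two bits of the bottom half and insert
       the top half, shifted left by 2, above them. */
    uint64x1_t packed = vsli_n_u64(vget_low_u64(pairs),
                                   vget_high_u64(pairs), 2);

    /* Masking in a general-purpose register stands in for the
       optional vbic step. */
    return (uint8_t)(vget_lane_u64(packed, 0) & 0x0F);
}
#endif

/* Hypothetical wrapper: bit i of the result is set when a[i] > b[i]. */
static uint8_t pack_gt_mask(const float a[4], const float b[4])
{
    assert(a && b);
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    return pack_mask_neon(vcgtq_f32(vld1q_f32(a), vld1q_f32(b)));
#else
    /* Scalar fallback (assumption: only for non-NEON builds). */
    uint8_t bits = 0;
    for (int i = 0; i < 4; ++i)
        if (a[i] > b[i])
            bits |= (uint8_t)(1u << i);
    return bits;
#endif
}
```

With this in place, the four if statements from the question would reduce to something like array[0] |= pack_gt_mask(p1, p2); where p1 and p2 point at the four floats that were loaded into temp1 and temp2. If the compare result is already in a register, calling pack_mask_neon(neon_gt_res) directly avoids the reload.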
Source: https://stackoverflow.com/questions/46568992/neon-pack-vector-compare-result-into-bitmap