Pairwise addition in NEON

Submitted by 瘦欲@ on 2019-12-12 03:35:39

Question


I want to add the values at indices 0 and 1 of an int64x2_t vector in NEON. I have not been able to find a pairwise-add instruction that does this.

int64x2_t sum_64_2;
// The result I expect is:
//int64_t result = sum_64_2[0] + sum_64_2[1];
  • Is there a NEON instruction that does this?

Answer 1:


You can write it in two ways. This one explicitly uses the NEON VADD.I64 instruction:

int64x1_t f(int64x2_t v)
{
  return vadd_s64 (vget_high_s64 (v), vget_low_s64 (v));
}

and the following one relies on the compiler to choose correctly between NEON and general-purpose integer instructions. GCC 4.9 does the right thing in this case, but other compilers may not.

int64x1_t g(int64x2_t v)
{
  /* Initialize r so that vset_lane_s64 does not read an indeterminate value. */
  int64x1_t r = vdup_n_s64(0);
  r = vset_lane_s64(vgetq_lane_s64(v, 0) + vgetq_lane_s64(v, 1), r, 0);
  return r;
}

When targeting 32-bit ARM, the generated code is efficient. For AArch64, extra instructions are emitted, so the compiler could do better there.
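Since the question asks for a plain int64_t, here is a minimal sketch of one way to get a scalar out on each target. It is not part of the original answer: the helper name hsum_s64 and the array-based interface are assumptions made for illustration. On AArch64 it uses the across-vector intrinsic vaddvq_s64, which is only available there. On 32-bit ARM with NEON it reuses the vadd_s64 approach shown above and reads lane 0 with vget_lane_s64. Without NEON it falls back to ordinary scalar addition so the sketch still builds.

```c
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper (not from the original answer): returns the sum of
   the two 64-bit values stored at in[0] and in[1]. */
int64_t hsum_s64(const int64_t in[2])
{
#if defined(__aarch64__)
    /* AArch64 only: add both lanes of the vector into one scalar. */
    int64x2_t v = vld1q_s64(in);
    return vaddvq_s64(v);
#elif defined(__ARM_NEON) || defined(__ARM_NEON__)
    /* 32-bit ARM with NEON: add the high and low halves, then read lane 0. */
    int64x2_t v = vld1q_s64(in);
    int64x1_t s = vadd_s64(vget_high_s64(v), vget_low_s64(v));
    return vget_lane_s64(s, 0);
#else
    /* No NEON available: plain scalar fallback. */
    return in[0] + in[1];
#endif
}
```

Calling hsum_s64 on an array holding 3 and 4 returns 7 on all three paths. Whether the AArch64 path actually produces shorter code than f or g depends on the compiler and version, so it is worth checking the generated assembly.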



Source: https://stackoverflow.com/questions/30309692/pairwise-addition-in-neon
