How to find the horizontal maximum in a 256-bit AVX vector


再見小時候 2020-12-06 10:03

I have a __m256d vector packed with four 64-bit floating-point values.
I need to find the horizontal maximum of the vector's elements and store the result in a double-p

3 answers

南方客 (original poster)

2020-12-06 10:54
The general way of doing this for a vector v1 = [A, B, C, D] is
1. Permute v1 to v2 = [C, D, A, B] (swap 0th and 2nd elements, and 1st and 3rd ones)
2. Take the max; i.e. v3 = max(v1,v2). You now have [max(A,C), max(B,D), max(A,C), max(B,D)]
3. Permute v3 to v4, swapping the 0th and 1st elements, and the 2nd and 3rd ones.
4. Take the max again, i.e. v5 = max(v3,v4). Now v5 contains the horizontal max in all of its components.
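Specifically for AVX, the maximums can be done with _mm256_max_pd. The swap in step 3 stays inside each 128-bit lane, so _mm256_permute_pd with the immediate 0x5 handles it. The swap in step 1 crosses the 128-bit lanes, which _mm256_permute_pd cannot do; use _mm256_permute2f128_pd(v1, v1, 0x01) there instead (or _mm256_permute4x64_pd(v1, 0x4E) if AVX2 is available).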
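
The four steps above can be sketched in C roughly as follows. This is a minimal sketch, not a tuned implementation. The function names hmax_pd256 and hmax_from_array are made up for illustration, and the target("avx") attribute is a GCC/Clang-specific assumption that can be dropped when building with -mavx. Step 1 uses _mm256_permute2f128_pd because swapping the two halves crosses the 128-bit lane boundary; step 3 stays inside each lane, so _mm256_permute_pd is enough there.

```c
#include <immintrin.h>

/* Horizontal maximum of the four doubles in v1 = [A, B, C, D].
   Hypothetical helper name; assumes an AVX-capable CPU. */
__attribute__((target("avx")))
static double hmax_pd256(__m256d v1)
{
    /* Step 1: swap the 128-bit halves -> [C, D, A, B]. */
    __m256d v2 = _mm256_permute2f128_pd(v1, v1, 0x01);
    /* Step 2: [max(A,C), max(B,D), max(A,C), max(B,D)]. */
    __m256d v3 = _mm256_max_pd(v1, v2);
    /* Step 3: swap neighbours inside each 128-bit lane. */
    __m256d v4 = _mm256_permute_pd(v3, 0x5);
    /* Step 4: every element now holds the overall maximum. */
    __m256d v5 = _mm256_max_pd(v3, v4);
    /* Extract the lowest element as a scalar double. */
    return _mm_cvtsd_f64(_mm256_castpd256_pd128(v5));
}

/* Convenience wrapper (also hypothetical): loads four doubles from
   memory with an unaligned load and returns their maximum. */
__attribute__((target("avx")))
double hmax_from_array(const double *p)
{
    return hmax_pd256(_mm256_loadu_pd(p));
}
```

For example, calling hmax_from_array on the array {1.0, 7.5, -3.0, 2.0} should give 7.5. Note that _mm256_max_pd returns its second operand when either input is NaN, so NaN inputs may need separate handling.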

Hope that helps.