Sum Vector Components in OpenCL (SSE-like)

Submitted by 可紊 on 2019-12-07 12:54:53

Question


Is there a single instruction to calculate the sum of all components of a float4, e.g., in OpenCL?

float4 v;
float desiredResult = v.x + v.y + v.z + v.w;

Answer 1:


float4 v;
float desiredResult = dot(v, (float4)(1.0f, 1.0f, 1.0f, 1.0f));

It's a little more work, because you're multiplying each component by one before adding them, but some GPUs have a dot product instruction built in. So it might be faster or it might be slower; it depends on your hardware.
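To see why the two forms compute the same value, here is a minimal sketch in plain C rather than OpenCL C, so it can be compiled on a host without an OpenCL runtime. The `vec4` struct and the `dot4`, `sum_direct`, and `sum_via_dot` helpers are hypothetical stand-ins for the built-in `float4` type and `dot` function. They model only the arithmetic, not how any particular GPU executes it.

```c
/* Hypothetical stand-in for OpenCL's built-in float4 (not the real type). */
typedef struct { float x, y, z, w; } vec4;

/* Direct sum, as written in the question: three additions. */
static float sum_direct(vec4 v) {
    return v.x + v.y + v.z + v.w;
}

/* Scalar model of a four-component dot product: four multiplies, three adds. */
static float dot4(vec4 a, vec4 b) {
    return a.x * b.x + a.y * b.y + a.z * b.z + a.w * b.w;
}

/* Sum via a dot product with a vector of ones, as written in the answer. */
static float sum_via_dot(vec4 v) {
    vec4 ones = {1.0f, 1.0f, 1.0f, 1.0f};
    return dot4(v, ones);
}
```

Calling both helpers on the same `vec4` should give the same sum, since multiplying by 1.0f leaves each component unchanged. Which form is faster on a real device can only be settled by profiling on that hardware.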



Source: https://stackoverflow.com/questions/10811413/sum-vector-components-in-opencl-sse-like
