Question
Is there a single instruction to calculate the sum of all components of a float4, e.g., in OpenCL?
float4 v;
float desiredResult = v.x + v.y + v.z + v.w;
Answer 1:
float4 v;
float desiredResult = dot(v, (float4)(1.0f, 1.0f, 1.0f, 1.0f));
This does slightly more arithmetic, because each component is multiplied by one before the additions, but some GPUs have a built-in dot product instruction. So it might be faster or it might be slower; it depends on your hardware.
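To check that the two expressions compute the same value, here is a minimal host-side sketch in plain C. The `float4` struct and the `dot4` helper are hypothetical stand-ins for OpenCL's built-in `float4` type and `dot()` function, and `sum_direct` and `sum_via_dot` are names made up for this illustration. It says nothing about which form is faster on a given device.

```c
/* Hypothetical host-side stand-in for OpenCL's built-in float4 type. */
typedef struct { float x, y, z, w; } float4;

/* Hypothetical stand-in for OpenCL's dot() on float4 operands. */
static float dot4(float4 a, float4 b) {
    return a.x * b.x + a.y * b.y + a.z * b.z + a.w * b.w;
}

/* Component-wise sum, as written in the question. */
static float sum_direct(float4 v) {
    return v.x + v.y + v.z + v.w;
}

/* Sum expressed as a dot product with an all-ones vector, as in the answer. */
static float sum_via_dot(float4 v) {
    const float4 ones = {1.0f, 1.0f, 1.0f, 1.0f};
    return dot4(v, ones);
}
```

In an actual kernel you would call the built-in `dot()` directly; the only way to find out whether it beats the plain sum is to measure both on the target hardware.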
Source: https://stackoverflow.com/questions/10811413/sum-vector-components-in-opencl-sse-like