float VS floatN

Posted by 天大地大妈咪最大 on 2019-12-10 10:10:07

Question


Is there any advantage to using floatN instead of float in OpenCL?

For example:

float3 position;

and

float posX, posY, posZ;

Thank you


Answer 1:


It depends on the hardware.

NVIDIA GPUs have a scalar architecture, so vectors offer little performance advantage over purely scalar code there. Quoting the NVIDIA OpenCL Best Practices Guide:

The CUDA architecture is a scalar architecture. Therefore, there is no performance benefit from using vector types and instructions. These should only be used for convenience. It is also in general better to have more work-items than fewer using large vectors.

On CPUs and ATI GPUs you gain more from vectors, because these architectures have vector instructions (though I've heard this might be different on the latest Radeons; I wish I had a link to the article where I read this).

Quoting the ATI Stream OpenCL Programming Guide, for CPUs:

The SIMD floating point resources in a CPU (SSE) require the use of vectorized types (float4) to enable packed SSE code generation and extract good performance from the SIMD hardware.

This article provides a performance comparison on ATI GPUs of a kernel written with vectors vs pure scalar types.
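To make the scalar vs. vector distinction concrete, here is a rough host-side sketch in plain C (not OpenCL C, and not taken from either guide). The names vec4_t, add_scalar and add_vec4 are made up for illustration. Each loop iteration stands in for one work-item: the scalar version handles one float per work-item, the vector version handles four lanes, which is roughly what a float4 kernel expresses.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical host-side stand-in for OpenCL's float4. */
typedef struct { float x, y, z, w; } vec4_t;

/* Scalar style: each "work-item" i handles a single float. */
void add_scalar(const float *a, const float *b, float *out, size_t n)
{
    assert(a && b && out);
    for (size_t i = 0; i < n; ++i)
        out[i] = a[i] + b[i];
}

/* Vector style: each "work-item" i handles four lanes at once.
   With float4 operands in OpenCL C, the loop body would simply be
   out[i] = a[i] + b[i]; */
void add_vec4(const vec4_t *a, const vec4_t *b, vec4_t *out, size_t n)
{
    assert(a && b && out);
    for (size_t i = 0; i < n; ++i) {
        out[i].x = a[i].x + b[i].x;
        out[i].y = a[i].y + b[i].y;
        out[i].z = a[i].z + b[i].z;
        out[i].w = a[i].w + b[i].w;
    }
}
```

Both functions compute the same result. Whether the four-lane form is actually faster depends on the hardware and compiler, as the quotes above point out.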




Answer 2:


On both NVIDIA and AMD architectures, the memory is divided into banks of 128 bits. Reading a single float3 or float4 value is often faster for the memory controller than reading 3 separate floats.

When you read float values from consecutive memory addresses, you rely heavily on the compiler to combine the reads for you. There is no guarantee that posX, posY, and posZ are in the same bank. Declaring the value as a float3 usually forces its component floats to fall within the same bank.

How the GPUs handle the vector computations varies between vendors, but memory accesses on both platforms will benefit from the vectorization.
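As a rough illustration of the layout point, here is a host-side sketch in plain C11 (not OpenCL C). It assumes what the OpenCL specification says about 3-component vectors: a float3 has the size and alignment of a float4, i.e. 16 bytes. The type and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Three plain floats: 12 bytes, only 4-byte aligned, so one element
   may straddle a 16-byte boundary when stored in an array. */
typedef struct { float x, y, z; } scalar3_t;

/* Hypothetical host-side stand-in for an OpenCL float3:
   padded and aligned to 16 bytes, like a float4. */
typedef struct { _Alignas(16) float x; float y, z, pad; } vec3_t;

static_assert(sizeof(vec3_t) == 16, "vec3_t should occupy one 16-byte slot");

size_t scalar3_size(void) { return sizeof(scalar3_t); }
size_t vec3_size(void)    { return sizeof(vec3_t); }
size_t vec3_align(void)   { return _Alignof(vec3_t); }
```

The padded type costs 4 extra bytes per element, but every element starts on a 16-byte boundary, so a single 128-bit access can fetch all three components.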




Answer 3:


I'm not terribly familiar with OpenCL, but in GLSL doing math with vectors is more efficient because the GPU can apply the same operation to all N components concurrently. GLSL vectors also support operations like dot products as built-in language features.
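OpenCL C also provides a built-in dot() for its float vector types, so the convenience argument carries over. For comparison, this is what the same operation looks like when spelled out by hand in plain C; vec3f and dot3 are hypothetical names used only for this sketch.

```c
/* Hypothetical three-component vector for illustration. */
typedef struct { float x, y, z; } vec3f;

/* Hand-written dot product; with OpenCL float3 operands
   this would be a single call to the built-in dot(a, b). */
float dot3(vec3f a, vec3f b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}
```

With three separate scalars (posX, posY, posZ) every such operation has to be written out component by component, which is where the vector types mainly save typing.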



Source: https://stackoverflow.com/questions/8933604/float-vs-floatn
