OpenCL global memory vs. image memory performance differences on Nvidia and AMD

Submitted by China☆狼群 on 2019-12-22 11:26:57

Question


The OpenCL benchmarking site http://www.clbenchmark.com/ has benchmarks for

Image Filter: Separable Gaussian Blur - Global Memory Usage and
Image Filter: Separable Gaussian Blur - Image Memory Usage

Nvidia completely dominates the Global Memory Usage benchmark. For example, the GTX 580 is nearly twice as fast as the HD 7970. It's one of the few benchmarks where Nvidia still leads. Can someone explain why this is?

The reason I ask is that I have written a ray tracer on my GTX 590 which runs very fast. From most reviews I expected my ray tracer to run four times faster on a HD 7970. However, it actually runs four times slower, and I don't understand why. I don't use image buffers; I write the pixels out to global memory. When I profile the kernel, I see that the HD 7950 kernel time is four times longer, so I know the problem is on the kernel side and not in moving data across the PCI bus.


Answer 1:


Global memory is the device memory. Data buffers that live in global memory have the advantage that a kernel can both read and write them. They are slow, though: each access to a data buffer consumes more GPU cycles.

On the other hand, texture memory (what you call image memory) is usually faster to read than plain global memory, since accesses go through the texture cache and use fewer GPU cycles. But within a kernel an image can only be read-only or write-only.

If a buffer only needs to be read or only needs to be written, you can use image buffers, which are faster. But if you need a read-write buffer, you are forced to use data buffers (global memory).
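The difference shows up directly in the kernel signatures. The following is only a sketch, assuming OpenCL 1.x and a host program written in C that keeps its kernel source in a string; the names `image_vs_buffer_src`, `scale_image`, `scale_buffer` and `scale_buffer_ref` are made up for illustration and come from neither the question nor the benchmark.

```c
#include <stddef.h>

/* OpenCL C source kept as a string, the way a C host program would pass it
 * to clCreateProgramWithSource. All kernel names here are hypothetical. */
static const char *image_vs_buffer_src =
"__constant sampler_t smp = CLK_NORMALIZED_COORDS_FALSE |\n"
"                           CLK_ADDRESS_CLAMP_TO_EDGE |\n"
"                           CLK_FILTER_NEAREST;\n"
"\n"
"/* Image path: src can only be read, dst can only be written. */\n"
"__kernel void scale_image(__read_only image2d_t src,\n"
"                          __write_only image2d_t dst,\n"
"                          float k)\n"
"{\n"
"    int2 pos = (int2)(get_global_id(0), get_global_id(1));\n"
"    float4 px = read_imagef(src, smp, pos);\n"
"    write_imagef(dst, pos, px * k);\n"
"}\n"
"\n"
"/* Buffer path: the same global buffer is read and written in place. */\n"
"__kernel void scale_buffer(__global float4 *data, float k)\n"
"{\n"
"    size_t i = get_global_id(0);\n"
"    data[i] = data[i] * k;\n"
"}\n";

/* CPU reference for scale_buffer, handy for checking device results. */
static void scale_buffer_ref(float *data, size_t n_floats, float k)
{
    for (size_t i = 0; i < n_floats; ++i)
        data[i] *= k;
}
```

The image version needs two separate image objects because one kernel cannot read and write the same image in OpenCL 1.x, while the buffer version updates its data in place.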

One more thing to note: any read from an image buffer fetches 4 values at a time if the image is declared RGBA. You can get the same advantage with data buffers by using float4, since the GPU can then access 4 float values in one fetch, which improves performance.
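As a rough illustration of the float4 idea, here is the same element-wise add written once per float and once per float4. This is a sketch, not a tuned kernel: it assumes the buffers are padded to a multiple of four floats, and the names `vec4_src`, `add_scalar`, `add_vec4` and `vec4_work_items` are hypothetical. Whether the float4 version is actually faster depends on the device, so measure both.

```c
#include <stddef.h>

/* OpenCL C source as a string; kernel names are hypothetical. */
static const char *vec4_src =
"/* Scalar version: each work-item loads and stores one float. */\n"
"__kernel void add_scalar(__global const float *a,\n"
"                         __global const float *b,\n"
"                         __global float *out)\n"
"{\n"
"    size_t i = get_global_id(0);\n"
"    out[i] = a[i] + b[i];\n"
"}\n"
"\n"
"/* float4 version: each work-item loads and stores four floats. */\n"
"__kernel void add_vec4(__global const float4 *a,\n"
"                       __global const float4 *b,\n"
"                       __global float4 *out)\n"
"{\n"
"    size_t i = get_global_id(0);\n"
"    out[i] = a[i] + b[i];\n"
"}\n";

/* Number of work-items to launch for add_vec4, rounding up.
 * Assumption: the buffers are padded to a multiple of four floats,
 * so the last work-item never touches memory outside the buffer. */
static size_t vec4_work_items(size_t n_floats)
{
    return (n_floats + 3) / 4;
}
```

With 1024 floats, `add_scalar` would be launched with a global size of 1024 and `add_vec4` with `vec4_work_items(1024)`, a quarter as many work-items.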

Always try to use global memory as little as possible (see the NVIDIA or AMD manuals for the exact number of cycles each kind of memory access costs). Please let me know if you want more explanation :)



Source: https://stackoverflow.com/questions/15322206/opencl-global-memory-vs-image-memory-performance-differences-on-nvidia-and-amd
