Question
As far as I know, constant memory in CUDA is a dedicated memory space, and it is faster than global memory. But the OpenCL spec says the following:
"The __constant or constant address space name is used to describe variables allocated in global memory and which are accessed inside a kernel(s) as read-only variables"
So __constant memory is carved out of __global memory. Does that mean it has the same access performance as __global memory?
Answer 1:
It depends on the hardware and software architecture of the OpenCL platform you are using. For example, one can envision an architecture with read-only caches that don't need to participate in cache coherency. These caches could be used for constant memory but not global memory. So you might see faster accesses to constant memory.
That being said, none of the architectures I'm familiar with operate this way. So that's just hypothetical.
Answer 2:
The OpenCL standard does not specify how constant memory should be implemented, but in NVIDIA GPUs constant memory is cached. I don't know what AMD does.
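Since the standard leaves the implementation open, one way to find out what a given device does is to time the same kernel with its read-only table passed once as __constant and once as __global. The sketch below is only an illustration of that idea: the kernel names affine_constant and affine_global, the two-element coeff table, and the host-side shim (fake_gid and the empty qualifier macros) are all assumptions made up for this example, not anything from the spec or a vendor SDK. The shim is skipped when the file is built as OpenCL C and exists only so the arithmetic can be checked as plain C.

```c
#ifndef __OPENCL_VERSION__
/* Host-side shim (hypothetical, for checking the logic on a CPU only).
   An OpenCL compiler defines __OPENCL_VERSION__ and skips this part. */
#include <stddef.h>
#define __kernel
#define __global
#define __constant
static size_t fake_gid;  /* stand-in for the current work-item index */
static size_t get_global_id(unsigned dim) { (void)dim; return fake_gid; }
#endif

/* Coefficients read through the __constant address space.
   Every work-item reads the same two addresses. */
__kernel void affine_constant(__constant float *coeff,
                              __global const float *in,
                              __global float *out)
{
    size_t i = get_global_id(0);
    out[i] = coeff[0] * in[i] + coeff[1];
}

/* Identical arithmetic, but the coefficients are read through __global. */
__kernel void affine_global(__global const float *coeff,
                            __global const float *in,
                            __global float *out)
{
    size_t i = get_global_id(0);
    out[i] = coeff[0] * in[i] + coeff[1];
}
```

Both kernels produce the same results by construction, so any timing difference measured on a real device comes from how that platform implements the two address spaces. On hardware that caches constant memory the first variant may be faster; on others the two may perform the same.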
Source: https://stackoverflow.com/questions/12153443/is-the-access-performance-of-constant-memory-as-same-as-global-memory-on-ope