How to use shared memory between kernel calls in CUDA?

谁说我不能喝 submitted on 2019-12-02 13:51:37

Question


I want to share data held in shared memory between successive calls of one kernel. Can shared memory persist between kernel calls?


Answer 1:


No, you can't. Shared memory has thread-block lifetime: a variable stored in it is accessible to all threads belonging to one block, and only during a single __global__ function invocation.
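As a minimal sketch (not part of the original answer), the standard way to pass data between two launches is to stage it through global memory; the shared array exists only inside the first kernel. The names stage1, stage2, and buf are illustrative:

    #include <cuda_runtime.h>

    __global__ void stage1(float *buf) {
        __shared__ float tile[256];          // per-block storage, lives only for this launch
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        tile[threadIdx.x] = (float)i;
        __syncthreads();
        buf[i] = tile[threadIdx.x] * 2.0f;   // persist the result via global memory
    }

    __global__ void stage2(const float *buf, float *out) {
        // `tile` from stage1 no longer exists here; only `buf` survives between launches
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        out[i] = buf[i] + 1.0f;
    }

    int main() {
        const int n = 1024;
        float *buf, *out;
        cudaMalloc(&buf, n * sizeof(float));
        cudaMalloc(&out, n * sizeof(float));
        stage1<<<n / 256, 256>>>(buf);       // first launch fills buf
        stage2<<<n / 256, 256>>>(buf, out);  // second launch reads it back
        cudaDeviceSynchronize();
        cudaFree(buf);
        cudaFree(out);
        return 0;
    }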




Answer 2:


Try page-locked (pinned) memory, though access to it will be much slower than to device memory. Allocate it with cudaHostAlloc(&ptr, size, cudaHostAllocMapped), then pass the mapped pointer to the kernel code.
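A hedged sketch of that approach, assuming a device that supports mapped host memory (cudaSetDeviceFlags must be called before any other CUDA call creates a context); the kernel name touch and the sizes are made up for illustration:

    #include <cstdio>
    #include <cuda_runtime.h>

    __global__ void touch(int *p, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) p[i] += 1;                        // each access crosses the PCIe bus
    }

    int main() {
        cudaSetDeviceFlags(cudaDeviceMapHost);       // enable mapped pinned allocations
        const int n = 1024;
        int *h = nullptr, *d = nullptr;
        cudaHostAlloc((void **)&h, n * sizeof(int), cudaHostAllocMapped);
        for (int i = 0; i < n; ++i) h[i] = i;
        cudaHostGetDevicePointer((void **)&d, h, 0); // device-side view of the same memory
        touch<<<(n + 255) / 256, 256>>>(d, n);
        cudaDeviceSynchronize();                     // host sees the updates after sync
        printf("%d\n", h[0]);                        // prints 1
        cudaFreeHost(h);
        return 0;
    }

Because successive launches see the same mapped allocation, values written by one kernel are visible to the next, at the cost of every access going over the bus instead of device memory.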




Answer 3:


Previously you could do this in a non-standard way: give each shared-memory block a unique id, and have the next kernel check the id and carry out the required processing on that block. This was hard to implement, since you had to guarantee full occupancy for each kernel and handle various corner cases. In addition, without formal support you could not rely on compatibility across compute devices and CUDA versions.



Source: https://stackoverflow.com/questions/10609591/how-to-use-shared-memory-between-kernel-call-of-cuda
