In CUDA, what is memory coalescing, and how is it achieved?

予麋鹿 2020-11-29 15:50

What does "coalesced" mean in a CUDA global memory transaction? I couldn't understand it even after going through my CUDA guide. How do I achieve it? In CUDA programming guide matrix exampl

4 answers
  •  臣服心动
    2020-11-29 16:33

    If the threads of a warp access consecutive global memory locations in the same instruction, the hardware combines (coalesces) those accesses into as few memory transactions as possible, ideally a single one. In the matrix example, the elements of a row are stored contiguously, followed by the next row, and so on (row-major order). For example, for a 2x2 matrix and 2 threads in a block, the memory locations are arranged as:

    (0,0) (0,1) (1,0) (1,1)

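    If each thread walks along its own row, the two threads access (0,0) and (1,0) at the same step; these are not adjacent in memory, so the accesses cannot be coalesced. If each thread walks down its own column, the two threads access (0,0) and (0,1) at the same step; these are adjacent, so the accesses can be coalesced.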
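
The two access patterns described above could be sketched roughly as follows. This is only an illustration, not code from the programming guide: the kernel names `sumColumns` and `sumRows`, the helper `rowMajorIndex`, the 1024x1024 size, and the use of managed memory are all assumptions made for the example. Both kernels produce correct sums; they differ only in which addresses neighboring threads touch at the same loop step.

```cuda
#include <cassert>
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: offset of element (row, col) in a row-major
// matrix with `width` columns.
__host__ __device__ inline int rowMajorIndex(int row, int col, int width) {
    return row * width + col;
}

// One thread per column. At every loop step, neighboring threads read
// neighboring addresses, so a warp's loads can be coalesced.
__global__ void sumColumns(const float* m, float* out, int height, int width) {
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    if (col >= width) return;
    float acc = 0.0f;
    for (int row = 0; row < height; ++row)
        acc += m[rowMajorIndex(row, col, width)];
    out[col] = acc;
}

// One thread per row. At every loop step, neighboring threads read
// addresses `width` elements apart, so the loads are strided and
// generally cannot be coalesced.
__global__ void sumRows(const float* m, float* out, int height, int width) {
    int row = blockIdx.x * blockDim.x + threadIdx.x;
    if (row >= height) return;
    float acc = 0.0f;
    for (int col = 0; col < width; ++col)
        acc += m[rowMajorIndex(row, col, width)];
    out[row] = acc;
}

int main() {
    // The 2x2 layout from the answer: (0,1) sits right after (0,0),
    // while (1,0) is a full row further on.
    assert(rowMajorIndex(0, 1, 2) == 1);
    assert(rowMajorIndex(1, 0, 2) == 2);

    const int height = 1024, width = 1024;  // assumed sizes
    float *m, *colSums, *rowSums;
    cudaMallocManaged(&m, size_t(height) * width * sizeof(float));
    cudaMallocManaged(&colSums, width * sizeof(float));
    cudaMallocManaged(&rowSums, height * sizeof(float));
    for (int i = 0; i < height * width; ++i) m[i] = 1.0f;

    const int threads = 256;
    sumColumns<<<(width + threads - 1) / threads, threads>>>(m, colSums, height, width);
    sumRows<<<(height + threads - 1) / threads, threads>>>(m, rowSums, height, width);
    cudaDeviceSynchronize();

    printf("colSums[0] = %.0f, rowSums[0] = %.0f\n", colSums[0], rowSums[0]);

    cudaFree(m);
    cudaFree(colSums);
    cudaFree(rowSums);
    return 0;
}
```

Timing the two kernels with a profiler would be the way to see how much the strided pattern costs on a given GPU; the sketch itself makes no claim about the size of the difference.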
