Is coalescing triggered for accessing memory in reverse order?

夕颜 2021-01-14 12:05

Let's say I have several threads that access memory at addresses A+0, A+4, A+8, A+12 (each successive thread accesses the next address). Such an access is coalesced, right?

However if I

2 Answers
  •  [愿得一人]
    2021-01-14 12:27

    It's also worth noting that a main purpose of the L2 cache in an NVIDIA GPU is to collapse reads and coalesce writes. So if one warp were accessing

    thread 0 -> A+0
    thread 1 -> A+8
    thread 2 -> A+16
    thread 3 -> A+24
    ...
    

    and another warp was accessing

    thread 0 -> A+4
    thread 1 -> A+12
    thread 2 -> A+20
    thread 3 -> A+28
    ...
    

    These two accesses will not coalesce inside the SM, but they will generally coalesce in the L2 cache, so GPU memory is touched only once.
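
The two-warp example can be checked with a small host-side model (plain Python, not GPU code). This is only a rough sketch: it assumes 4-byte accesses, a 32-byte memory sector, 32 threads per warp, and a sector-aligned base address A. The helper `sectors_touched` and its parameters are hypothetical names made up for this illustration, and the real transaction granularity depends on the GPU architecture.

```python
SECTOR_BYTES = 32  # assumed sector size; actual granularity varies by architecture
WARP_SIZE = 32     # threads per warp


def sectors_touched(base, stride, offset=0, width=4, reverse=False):
    """Return the set of sector indices that one warp touches.

    Thread t reads `width` bytes at base + offset + stride * t.
    With reverse=True, thread t reads the address that thread
    (WARP_SIZE - 1 - t) would read in the forward pattern.
    """
    sectors = set()
    for t in range(WARP_SIZE):
        lane = WARP_SIZE - 1 - t if reverse else t
        addr = base + offset + stride * lane
        first = addr // SECTOR_BYTES
        last = (addr + width - 1) // SECTOR_BYTES
        sectors.update(range(first, last + 1))
    return sectors


A = 0  # hypothetical, sector-aligned base address

warp_a = sectors_touched(A, stride=8)            # A+0, A+8, A+16, ...
warp_b = sectors_touched(A, stride=8, offset=4)  # A+4, A+12, A+20, ...

forward = sectors_touched(A, stride=4)                 # A+0, A+4, A+8, ...
backward = sectors_touched(A, stride=4, reverse=True)  # same addresses, reversed

print(len(warp_a), len(warp_b), len(warp_a | warp_b))  # 8 8 8
print(forward == backward, len(forward))               # True 4
```

Under these assumptions each warp on its own touches 8 sectors but uses only half of the bytes in each one, and both warps touch the same 8 sectors, so a cache shared by the two warps can serve both from a single fetch. The last line shows that, in this model, the set of sectors does not depend on which thread reads which address.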
