CUDA: In-warp reduction and the volatile keyword

日久生厌 2020-12-11 09:29

After reading the question and its answer from the following
LINK

I still have a question remaining in my mind. From my background in C/C++, I understand that us

1 answer
  • 2020-12-11 10:06

    Removing the volatile keyword from that code could break it on Fermi and Kepler GPUs. Those GPUs lack instructions that operate directly on shared memory. Instead, the compiler must emit a load/store pair that moves the value from shared memory into a register and back.

    What the volatile keyword does in this context is make the compiler honour that load-operate-store cycle instead of performing an optimisation that keeps the value of s_data[tid] in a register. Keeping the sum accumulating in a register would break the implicit memory synchronisation that the warp-level shared memory summation relies on: each thread has to see the values its neighbours wrote to shared memory in the previous step.
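
To make the load-operate-store cycle concrete, here is a minimal sketch of the kind of in-warp reduction being discussed. It is not the code from the linked question: the names warp_reduce, reduce64_kernel and reduce64, the int element type and the fixed block size of 64 are all assumptions chosen for illustration. It also relies on the warp executing in lockstep, as described above for Fermi and Kepler; on Volta and later GPUs, NVIDIA recommends explicit __syncwarp() calls or warp shuffle intrinsics instead of this implicit synchronisation.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: final in-warp stage of a shared-memory sum.
// Because s_data is volatile, every "+=" below is compiled as a load from
// shared memory, an add, and a store back to shared memory. Without
// volatile the compiler may keep s_data[tid] in a register across the
// steps, so other threads would read stale values from shared memory.
__device__ void warp_reduce(volatile int *s_data, unsigned int tid) {
    s_data[tid] += s_data[tid + 32];
    s_data[tid] += s_data[tid + 16];
    s_data[tid] += s_data[tid + 8];
    s_data[tid] += s_data[tid + 4];
    s_data[tid] += s_data[tid + 2];
    s_data[tid] += s_data[tid + 1];
}

// Assumes exactly one block of 64 threads and 64 input elements.
__global__ void reduce64_kernel(const int *in, int *out) {
    __shared__ int s_data[64];
    unsigned int tid = threadIdx.x;
    s_data[tid] = in[tid];
    __syncthreads();                      // all 64 values are in shared memory
    if (tid < 32) warp_reduce(s_data, tid);
    if (tid == 0) *out = s_data[0];       // thread 0 holds the total
}

// Host wrapper (error checking omitted to keep the sketch short).
int reduce64(const int *h_in) {
    int *d_in = nullptr, *d_out = nullptr;
    int h_out = 0;
    cudaMalloc(&d_in, 64 * sizeof(int));
    cudaMalloc(&d_out, sizeof(int));
    cudaMemcpy(d_in, h_in, 64 * sizeof(int), cudaMemcpyHostToDevice);
    reduce64_kernel<<<1, 64>>>(d_in, d_out);
    cudaMemcpy(&h_out, d_out, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    return h_out;
}

int main() {
    int h_in[64];
    for (int i = 0; i < 64; ++i) h_in[i] = 1;   // input is 64 ones
    printf("sum = %d\n", reduce64(h_in));
    return 0;
}
```

Dropping volatile from the warp_reduce parameter is the change the answer warns about: the source still compiles, but the compiler is then free to accumulate in a register and skip the intermediate stores.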
