What do work items execute when conditionals are used in GPU programming?

空扰寡人 提交于 2019-12-03 03:23:40
talonmies

NVIDIA gpus use conditional execution to handle branch divergence within the SIMD group ("warp"). In your if..else example, both branches get executed by every thread in the diverging warp, but those threads which don't follow a given branch are flagged and perform a null op instead. This is the classic branch divergence penalty - interwarp branch divergence takes two passes through the code section to retire for warp. This isn't ideal, which is why performance oriented code tries to minimize this. One thing which often catches out people is making an assumption about which section of a divergent path gets executed "first". The have been some very subtle bugs cause by second guessing the internal order of execution within a divergent warp.

For simpler conditionals, NVIDIA GPUs support conditional evaluation at the ALU, which causes no divergence, and for conditionals where the whole warp follows the same path, there is also obviously no penalty.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!