CUDA streams not overlapping

Submitted by 社会主义新天地 on 2019-12-04 01:55:07

According to this post on the NVIDIA forums, the profiler serializes work across streams in order to collect accurate timing data, so overlap may not show up while profiling. If you think your timings are off, make sure you're measuring with CUDA events...
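As a rough illustration of timing with CUDA events instead of relying on the profiler, here is a minimal sketch. The `scale` kernel and the buffer size are made up for the example, and error checking is left out to keep it short.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel, present only so there is some GPU work to time.
__global__ void scale(float *data, int n, float factor)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

int main()
{
    const int n = 1 << 20;                   // arbitrary size for the example
    float *d_data = nullptr;
    cudaMalloc(&d_data, n * sizeof(float));
    cudaMemset(d_data, 0, n * sizeof(float));

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start, 0);               // marker enqueued before the kernel
    scale<<<(n + 255) / 256, 256>>>(d_data, n, 2.0f);
    cudaEventRecord(stop, 0);                // marker enqueued after the kernel
    cudaEventSynchronize(stop);              // block the host until 'stop' is reached

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);  // elapsed GPU time in milliseconds
    printf("kernel time: %.3f ms\n", ms);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(d_data);
    return 0;
}
```

Events are recorded on the GPU timeline, so the measured interval does not include host-side launch overhead the way a CPU timer around the launch would.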

I've been experimenting with streams lately, and I found the "simpleMultiCopy" example from the SDK to be really helpful, particularly for its scheduling logic and synchronization.
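For readers without the SDK at hand, the general pattern looks roughly like the sketch below. This is not the code from "simpleMultiCopy"; it is a simplified example with a hypothetical `addOne` kernel, two streams, and no error checking. It assumes a device that can overlap copies with kernel execution, and it uses pinned host memory, which asynchronous copies need in order to overlap.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: adds 1 to every element of its chunk.
__global__ void addOne(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1.0f;
}

int main()
{
    const int kStreams = 2;                  // arbitrary choice for the example
    const int chunk = 1 << 20;               // elements handled per stream
    const size_t bytes = chunk * sizeof(float);

    float *h_buf = nullptr;
    float *d_buf = nullptr;
    cudaMallocHost(&h_buf, kStreams * bytes); // pinned host memory
    cudaMalloc(&d_buf, kStreams * bytes);
    for (int i = 0; i < kStreams * chunk; ++i) h_buf[i] = 0.0f;

    cudaStream_t streams[kStreams];
    for (int s = 0; s < kStreams; ++s) cudaStreamCreate(&streams[s]);

    // Each stream gets its own copy-in, kernel, copy-out sequence.
    // Work within one stream runs in order; work in different streams may overlap.
    for (int s = 0; s < kStreams; ++s) {
        float *h = h_buf + s * chunk;
        float *d = d_buf + s * chunk;
        cudaMemcpyAsync(d, h, bytes, cudaMemcpyHostToDevice, streams[s]);
        addOne<<<(chunk + 255) / 256, 256, 0, streams[s]>>>(d, chunk);
        cudaMemcpyAsync(h, d, bytes, cudaMemcpyDeviceToHost, streams[s]);
    }

    // Wait for every stream before touching the results on the host.
    for (int s = 0; s < kStreams; ++s) cudaStreamSynchronize(streams[s]);

    printf("h_buf[0] = %.1f\n", h_buf[0]);

    for (int s = 0; s < kStreams; ++s) cudaStreamDestroy(streams[s]);
    cudaFreeHost(h_buf);
    cudaFree(d_buf);
    return 0;
}
```

Whether the copies and kernels actually overlap depends on the hardware and on the order in which work is issued, which is why a sample like "simpleMultiCopy" is worth reading alongside a sketch like this.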


If you want to see kernels overlapping with other kernels (concurrent kernels), you need the CUDA Visual Profiler 5.0 that ships with the CUDA 5.0 Toolkit. I don't think earlier profilers can show this. It should also show kernel and memcpy overlap.
