Dynamic parallelism - launching many small kernels is very slow

╄→гoц情女王★ 提交于 2019-12-03 21:50:29

There is a cost in launching kernels, either parent or child. If your child kernels do not extract much parallelism and there is not much benefit against their non-parallel counterparts, then your faint benefit may be cancelled out by the child kernel launch overheads.

In formulas, let to be the overhead to execute a child kernel, te its execution time and ts the time to execute the same code without the help of dynamic parallelism. The speedup arising from the use of dynamic parallelism is ts/(to+te). Perhaps (but this cannot be envinced from your code) te<ts but te,ts<<to, so that ts/(to+te) is about (ts/to)<1 and you observe a slowdown instead of a speedup.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!