Can we get real-time printout while a kernel is running?

Posted by 霸气de小男生 on 2021-02-19 04:35:09

Question


I realized that "cuPrintf" can be used in the kernel, but "cudaPrintfDisplay" can only be called from CPU code. It seems to me that the "cuPrintf" output can only be flushed to stdout after the kernel returns. My question is: can we get real-time printout while the kernel is running?


Answer 1:


As you have noticed, cuPrintf() (and printf() on devices of compute capability >= 2.0) simply adds strings to a buffer while the kernel is running, and the buffer is printed out after the kernel ends.
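
To illustrate the buffering, here is a minimal sketch using the built-in device printf(). The names hello_kernel and run_hello are made up for this example. The point is that the host has to reach a synchronization point, here cudaDeviceSynchronize(), before the buffered text shows up on stdout.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: every thread appends one line to the device-side
// printf buffer. Nothing is written to stdout at this point.
__global__ void hello_kernel() {
    printf("hello from block %d, thread %d\n", blockIdx.x, threadIdx.x);
}

// Hypothetical host helper: launches the kernel and waits for it.
// Returns cudaSuccess if both the launch and the kernel succeeded.
cudaError_t run_hello(int blocks, int threads_per_block) {
    hello_kernel<<<blocks, threads_per_block>>>();

    // The launch is asynchronous; this only catches launch errors.
    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) return err;

    // Synchronization point: the host waits for the kernel, and the
    // buffered printf output is flushed to stdout.
    return cudaDeviceSynchronize();
}
```

Calling run_hello(1, 4) from main() should print four lines, but only once the kernel has finished.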

I don't think there is a way to get real-time printf output from a kernel. To reduce the delay, though, you may be able to run the kernel with fewer threads per launch. Since __device__ printf() is only a diagnostic or debugging tool, any loss in performance shouldn't matter.
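
A rough sketch of the "fewer threads each time" idea, assuming the work can be split into independent chunks. The names work_kernel and launch_in_chunks and the block size of 128 are made up for illustration. Each chunk gets its own launch followed by a synchronization, so the output of one chunk appears before the next chunk starts.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: each thread reports the element it handles.
// 'offset' is the index of the first element in this chunk and
// 'count' is the number of elements in this chunk.
__global__ void work_kernel(int offset, int count) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < count) {
        printf("processing element %d\n", offset + i);
    }
}

// Hypothetical host helper: processes 'total' elements in chunks of at
// most 'chunk' elements. Returns the number of launches performed,
// or -1 on a bad argument or a CUDA error.
int launch_in_chunks(int total, int chunk) {
    if (total < 0 || chunk <= 0) return -1;

    const int threads = 128;  // threads per block (arbitrary choice)
    int launches = 0;

    for (int offset = 0; offset < total; offset += chunk) {
        int remaining = total - offset;
        int count = remaining < chunk ? remaining : chunk;
        int blocks = (count + threads - 1) / threads;

        work_kernel<<<blocks, threads>>>(offset, count);
        if (cudaGetLastError() != cudaSuccess) return -1;

        // Waiting here flushes this chunk's printf output before the
        // next chunk is launched.
        if (cudaDeviceSynchronize() != cudaSuccess) return -1;
        fflush(stdout);
        ++launches;
    }
    return launches;
}
```

For example, launch_in_chunks(1000, 256) performs four launches. Smaller chunks give more frequent output at the cost of more launch and synchronization overhead, which is acceptable for debugging.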

Maybe the best thing would be to run your code in a CUDA debugger and get immediate feedback that way.



Source: https://stackoverflow.com/questions/12589380/can-we-get-the-on-time-print-out-during-the-kernel-running
