CUDA streams not overlapping

-上瘾入骨i 2021-02-20 16:16

I have something very similar to the following code:

int k, no_streams = 4;
cudaStream_t stream[no_streams];
for(k = 0; k < no_streams; k++) cudaStreamCreate(&strea
        
2 Answers
  •  青春惊慌失措
    2021-02-20 17:12

    According to this post on the NVIDIA forums, the profiler serializes work across streams in order to get accurate timing data, so you won't see any overlap while profiling. If you think your timings are off, make sure you're measuring with CUDA events...
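
For what it's worth, here is a minimal sketch of timing with CUDA events rather than relying on the profiler. It is not taken from the original question: the kernel `scale` and the helper `time_scale_ms` are made-up names for illustration, and the work being timed is just a placeholder.

```cuda
#include <cassert>
#include <cuda_runtime.h>

// Hypothetical kernel, present only so there is something to time.
__global__ void scale(float *data, int n, float factor)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// Hypothetical helper: returns the GPU time (in ms) between the two events,
// or a negative value if allocation or the timing query fails.
float time_scale_ms(int n)
{
    assert(n > 0);

    float *d_data = nullptr;
    float ms = -1.0f;
    cudaEvent_t start, stop;

    if (cudaMalloc(&d_data, n * sizeof(float)) != cudaSuccess) return -1.0f;
    cudaMemset(d_data, 0, n * sizeof(float));

    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start, 0);                        // start marker in the default stream
    scale<<<(n + 255) / 256, 256>>>(d_data, n, 2.0f);
    cudaEventRecord(stop, 0);                         // stop marker queued after the kernel
    cudaEventSynchronize(stop);                       // block the host until stop has completed

    if (cudaEventElapsedTime(&ms, start, stop) != cudaSuccess) ms = -1.0f;

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(d_data);
    return ms;
}
```

Calling `time_scale_ms(1 << 20)` from `main` and printing the result gives the elapsed GPU time for that one launch. To time work in a particular stream, pass that stream instead of `0` to `cudaEventRecord`.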

    I've been experimenting with streams lately, and I found the "simpleMultiCopy" example from the SDK really helpful, particularly for the logic and synchronization it uses.
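
As a rough illustration of the kind of logic involved (this is my own sketch, not code from simpleMultiCopy or from the question), the usual pattern is to give each stream its own chunk and issue copy-in, kernel, and copy-out into that stream. The names `add_one` and `run_chunks_in_streams` are hypothetical, and whether anything actually overlaps depends on the device and on the host buffer being pinned.

```cuda
#include <cassert>
#include <cuda_runtime.h>

// Hypothetical kernel: increments every element of its chunk.
__global__ void add_one(int *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1;
}

// Hypothetical helper: processes 4 chunks of `chunk` ints, one per stream.
// Returns true if every element came back incremented by one.
bool run_chunks_in_streams(int chunk)
{
    assert(chunk > 0);

    const int no_streams = 4;
    const int n_total = chunk * no_streams;
    const size_t chunk_bytes = chunk * sizeof(int);
    int *h_data = nullptr, *d_data = nullptr;
    cudaStream_t stream[no_streams];

    // Pinned (page-locked) host memory; async copies from pageable memory
    // generally cannot overlap with kernel execution.
    if (cudaMallocHost(&h_data, n_total * sizeof(int)) != cudaSuccess) return false;
    if (cudaMalloc(&d_data, n_total * sizeof(int)) != cudaSuccess) {
        cudaFreeHost(h_data);
        return false;
    }

    for (int i = 0; i < n_total; i++) h_data[i] = i;
    for (int k = 0; k < no_streams; k++) cudaStreamCreate(&stream[k]);

    // Each stream gets its own copy-in, kernel, and copy-out on a separate chunk.
    for (int k = 0; k < no_streams; k++) {
        int offset = k * chunk;
        cudaMemcpyAsync(d_data + offset, h_data + offset, chunk_bytes,
                        cudaMemcpyHostToDevice, stream[k]);
        add_one<<<(chunk + 255) / 256, 256, 0, stream[k]>>>(d_data + offset, chunk);
        cudaMemcpyAsync(h_data + offset, d_data + offset, chunk_bytes,
                        cudaMemcpyDeviceToHost, stream[k]);
    }

    // Wait for all streams before touching the results on the host.
    bool ok = (cudaDeviceSynchronize() == cudaSuccess);
    for (int i = 0; ok && i < n_total; i++) {
        if (h_data[i] != i + 1) ok = false;
    }

    for (int k = 0; k < no_streams; k++) cudaStreamDestroy(stream[k]);
    cudaFree(d_data);
    cudaFreeHost(h_data);
    return ok;
}
```

The result check only confirms correctness, not overlap. To see whether the streams actually ran concurrently you would still need to compare event timings against a single-stream run.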
