How does cudaMalloc work?

忘了有多久 2020-12-11 08:52

I am trying to modify the imageDenoising class in the CUDA SDK. I need to repeat the filter many times in order to measure the time, but my code doesn't work properly.

//st

3 answers
  •  南笙 (OP)
    2020-12-11 09:32

    Your kernel is running asynchronously - you need to wait for it to complete, e.g.

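    cudaMalloc((void **)&dst2, size);
    cudaMemcpy(dst2, dst, imageW * imageH * sizeof(TColor), cudaMemcpyHostToDevice);
    // grid and threads stand for your launch configuration (placeholder names)
    F1D<<<grid, threads>>>(dst, imageW, imageH, dst2);
    // cudaThreadSynchronize() is deprecated; cudaDeviceSynchronize() is the current equivalent
    cudaThreadSynchronize(); // *** wait for kernel to complete ***
    cudaFree(dst2);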
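
Since the goal is to repeat the filter and measure the time, one common approach is to bracket the launches with CUDA events and synchronize on the stop event before reading the elapsed time. The following is only a minimal sketch under stated assumptions: `copyFilter`, `divUp`, `averageKernelMs`, the 16x16 block size, and the 512x512 image are all hypothetical stand-ins, not the actual `F1D` kernel or the sample's settings.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: ceiling division used to size the grid.
static int divUp(int a, int b) { return (a + b - 1) / b; }

// Hypothetical stand-in for the filter kernel: copies src to dst.
__global__ void copyFilter(float *dst, const float *src, int w, int h)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x < w && y < h)
        dst[y * w + x] = src[y * w + x];
}

// Launches the kernel `repeats` times and returns the average time per launch in ms.
float averageKernelMs(float *dst, const float *src, int w, int h, int repeats)
{
    dim3 threads(16, 16);  // assumed block size
    dim3 grid(divUp(w, threads.x), divUp(h, threads.y));

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start);
    for (int i = 0; i < repeats; ++i)
        copyFilter<<<grid, threads>>>(dst, src, w, h);  // asynchronous launch
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);  // block the host until every launch has finished

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    return ms / repeats;
}

int main()
{
    const int w = 512, h = 512, repeats = 100;  // assumed sizes
    const size_t size = (size_t)w * h * sizeof(float);

    float *src = nullptr, *dst = nullptr;
    cudaMalloc((void **)&src, size);
    cudaMalloc((void **)&dst, size);
    cudaMemset(src, 0, size);

    float ms = averageKernelMs(dst, src, w, h, repeats);

    // Report any error raised by the allocations or launches above.
    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) {
        printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("average kernel time: %f ms\n", ms);

    cudaFree(src);
    cudaFree(dst);
    return 0;
}
```

The buffers are allocated once outside the timed loop, so the measurement covers only the kernel launches and not `cudaMalloc` or `cudaFree`. Without the `cudaEventSynchronize(stop)` call, the host could read the timer or free the buffers while the kernels are still running.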
    
