2D CUDA median filter optimization

清酒与你 2021-01-01 04:12

I have implemented a 2D median filter in CUDA and the whole program is shown below.

#include "cuda_runtime.h"
#include "cuda_runtime_api.h"
#include "de

4 answers
  •  北荒 (OP)
    2021-01-01 04:25

    It seems your threads do not share any data through shared memory. For a 3x3 filter, that means each pixel is read 9 times from global memory, which is unnecessary. This white paper may give you some ideas on how to use shared memory in a convolution kernel. Hope it helps.

    http://docs.nvidia.com/cuda/samples/3_Imaging/convolutionSeparable/doc/convolutionSeparable.pdf
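
    To make the idea concrete, here is a minimal sketch of a 3x3 median filter that stages each block's tile (plus a one-pixel halo) in shared memory. It is not taken from your program or from the white paper: the names `median3x3Shared`, `medianFilter3x3`, and `TILE` are hypothetical, the image is assumed to be 8-bit single-channel in row-major order, borders are handled by clamping, and error checking is omitted for brevity.

```cuda
#include <cuda_runtime.h>
#include <vector>

#define TILE 16                    // assumed block size: TILE x TILE output pixels
#define RADIUS 1                   // 3x3 window
#define SMEM (TILE + 2 * RADIUS)   // tile side including the halo

// Hypothetical kernel: must be launched with TILE x TILE threads per block.
__global__ void median3x3Shared(const unsigned char *in, unsigned char *out,
                                int width, int height) {
    __shared__ unsigned char tile[SMEM][SMEM];

    // Image coordinates of the top-left corner of the tile, halo included.
    const int x0 = blockIdx.x * TILE - RADIUS;
    const int y0 = blockIdx.y * TILE - RADIUS;

    // Cooperative load: the block's threads stride over all SMEM*SMEM entries.
    // Out-of-range coordinates are clamped to the image border (an assumption).
    for (int i = threadIdx.y * TILE + threadIdx.x; i < SMEM * SMEM; i += TILE * TILE) {
        int sx = i % SMEM, sy = i / SMEM;
        int gx = min(max(x0 + sx, 0), width - 1);
        int gy = min(max(y0 + sy, 0), height - 1);
        tile[sy][sx] = in[gy * width + gx];
    }
    __syncthreads();

    const int x = blockIdx.x * TILE + threadIdx.x;
    const int y = blockIdx.y * TILE + threadIdx.y;
    if (x >= width || y >= height) return;

    // Gather the 3x3 neighbourhood from shared memory only.
    unsigned char v[9];
    int n = 0;
    for (int dy = 0; dy < 3; ++dy)
        for (int dx = 0; dx < 3; ++dx)
            v[n++] = tile[threadIdx.y + dy][threadIdx.x + dx];

    // Partial selection sort: after placing the 5 smallest values, v[4] is the median.
    for (int i = 0; i <= 4; ++i) {
        int m = i;
        for (int j = i + 1; j < 9; ++j)
            if (v[j] < v[m]) m = j;
        unsigned char t = v[i]; v[i] = v[m]; v[m] = t;
    }
    out[y * width + x] = v[4];
}

// Hypothetical host wrapper: copies the image to the device, filters it, copies it back.
std::vector<unsigned char> medianFilter3x3(const std::vector<unsigned char> &img,
                                           int width, int height) {
    const size_t bytes = img.size();
    unsigned char *dIn = nullptr, *dOut = nullptr;
    cudaMalloc(&dIn, bytes);
    cudaMalloc(&dOut, bytes);
    cudaMemcpy(dIn, img.data(), bytes, cudaMemcpyHostToDevice);

    dim3 block(TILE, TILE);
    dim3 grid((width + TILE - 1) / TILE, (height + TILE - 1) / TILE);
    median3x3Shared<<<grid, block>>>(dIn, dOut, width, height);

    std::vector<unsigned char> result(bytes);
    cudaMemcpy(result.data(), dOut, bytes, cudaMemcpyDeviceToHost);  // waits for the kernel
    cudaFree(dIn);
    cudaFree(dOut);
    return result;
}
```

    With this layout each block reads its (TILE + 2) x (TILE + 2) region from global memory once, and all nine reads per output pixel hit shared memory instead. Whether it is actually faster for your image sizes is something you would need to measure.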
