2D CUDA median filter optimization

清酒与你 2021-01-01 04:12

I have implemented a 2D median filter in CUDA and the whole program is shown below.

#include "cuda_runtime.h"
#include "cuda_runtime_api.h"
#include "de


        
4 answers
  •  夕颜 (original poster)
    2021-01-01 04:32

    Quickselect finds the median in linear time on average, but it is hard to implement efficiently in CUDA because of its memory overhead: it partitions the data in place, so every thread needs its own writable copy of the filter window. The easiest approach for a highly parallel algorithm is to minimize memory overhead. Instead of a partial-sort algorithm like Quickselect, do away with that overhead entirely by using the Torben median algorithm. The Torben algorithm can be significantly slower than other algorithms because it makes repeated passes over the data, but it only reads the input and never modifies it. Therefore, there is no need to allocate shared memory or any other scratch buffer for the window.
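
    A minimal sketch of the Torben median described above, assuming float pixels. The name torbenMedian and the HD macro are made up for illustration and are not taken from the question's code. The macro only lets the same function compile as plain C++ and, under nvcc, as a device-callable function. For an even number of elements this variant returns the lower median.

```cpp
// HD expands to __host__ __device__ under nvcc and to nothing otherwise
// (illustrative macro, not part of the original program).
#ifdef __CUDACC__
#define HD __host__ __device__
#else
#define HD
#endif

// Torben-style median: repeated read-only passes over m[0..n-1].
// The input is never written to, so no scratch copy of the window is needed.
HD inline float torbenMedian(const float* m, int n)
{
    // First pass: find the value range [lo, hi].
    float lo = m[0], hi = m[0];
    for (int i = 1; i < n; ++i) {
        if (m[i] < lo) lo = m[i];
        if (m[i] > hi) hi = m[i];
    }

    const int half = (n + 1) / 2;
    float guess = lo, maxBelow = lo, minAbove = hi;
    int less = 0, greater = 0, equal = 0;

    for (;;) {
        // Guess the midpoint of the current range and count elements around it.
        guess = 0.5f * (lo + hi);
        less = greater = equal = 0;
        maxBelow = lo;   // largest element seen below the guess
        minAbove = hi;   // smallest element seen above the guess
        for (int i = 0; i < n; ++i) {
            if (m[i] < guess) {
                ++less;
                if (m[i] > maxBelow) maxBelow = m[i];
            } else if (m[i] > guess) {
                ++greater;
                if (m[i] < minAbove) minAbove = m[i];
            } else {
                ++equal;
            }
        }
        // Stop once neither side holds more than half of the elements.
        if (less <= half && greater <= half) break;
        // Otherwise shrink the range toward the heavier side.
        if (less > greater) hi = maxBelow;
        else                lo = minAbove;
    }

    if (less >= half)         return maxBelow;
    if (less + equal >= half) return guess;
    return minAbove;
}
```

    Each pass through the outer loop rescans all n elements, which is where the extra running time comes from. In exchange, the per-thread state is a handful of scalars that can stay in registers.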

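    Finally, for maximum speed, bind the input to a texture. This has the added bonus that the texture's addressing mode (clamp, for example) handles border extension, so the kernel needs no explicit bounds checks where the window hangs over the image edge. To minimize cache misses, traverse the window in the order the data is stored: for a row-major array, loop over rows in the outer loop and over columns in the inner loop.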
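
    A rough sketch of how the texture and loop-order advice might look in practice. It assumes a single-channel float image and uses the texture object API rather than the older texture-reference binding. The kernel name medianFilterTex, the image size, and the window radius are placeholders chosen for illustration, and error checking is left out for brevity. The kernel runs Torben-style counting passes directly over texture fetches, so it keeps no per-thread window buffer.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Median of the (2r+1) x (2r+1) window around each pixel.
// Out-of-range fetches are clamped to the border by the texture unit.
__global__ void medianFilterTex(cudaTextureObject_t tex, float* out,
                                int width, int height, int r)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= width || y >= height) return;

    const int n = (2 * r + 1) * (2 * r + 1);
    const int half = (n + 1) / 2;

    // First pass: window min and max. Rows outer, columns inner (row-major).
    float lo = tex2D<float>(tex, x + 0.5f, y + 0.5f), hi = lo;
    for (int dy = -r; dy <= r; ++dy)
        for (int dx = -r; dx <= r; ++dx) {
            float v = tex2D<float>(tex, x + dx + 0.5f, y + dy + 0.5f);
            lo = fminf(lo, v);
            hi = fmaxf(hi, v);
        }

    float guess = lo, maxBelow = lo, minAbove = hi;
    int less = 0, greater = 0, equal = 0;
    for (;;) {
        guess = 0.5f * (lo + hi);
        less = greater = equal = 0;
        maxBelow = lo;
        minAbove = hi;
        for (int dy = -r; dy <= r; ++dy)
            for (int dx = -r; dx <= r; ++dx) {
                float v = tex2D<float>(tex, x + dx + 0.5f, y + dy + 0.5f);
                if (v < guess)      { ++less;    maxBelow = fmaxf(maxBelow, v); }
                else if (v > guess) { ++greater; minAbove = fminf(minAbove, v); }
                else                { ++equal; }
            }
        if (less <= half && greater <= half) break;
        if (less > greater) hi = maxBelow; else lo = minAbove;
    }
    out[y * width + x] = (less >= half) ? maxBelow
                       : (less + equal >= half) ? guess : minAbove;
}

int main()
{
    const int W = 8, H = 6, R = 1;            // placeholder sizes
    std::vector<float> in(W * H), out(W * H);
    for (int i = 0; i < W * H; ++i) in[i] = float((i * 37) % 101);

    // Copy the image into a CUDA array, the backing store for the texture.
    cudaChannelFormatDesc desc = cudaCreateChannelDesc<float>();
    cudaArray_t arr;
    cudaMallocArray(&arr, &desc, W, H);
    cudaMemcpy2DToArray(arr, 0, 0, in.data(), W * sizeof(float),
                        W * sizeof(float), H, cudaMemcpyHostToDevice);

    cudaResourceDesc res = {};
    res.resType = cudaResourceTypeArray;
    res.res.array.array = arr;

    cudaTextureDesc td = {};
    td.addressMode[0] = cudaAddressModeClamp;  // border extension in x
    td.addressMode[1] = cudaAddressModeClamp;  // border extension in y
    td.filterMode = cudaFilterModePoint;       // no interpolation
    td.readMode = cudaReadModeElementType;
    td.normalizedCoords = 0;

    cudaTextureObject_t tex = 0;
    cudaCreateTextureObject(&tex, &res, &td, nullptr);

    float* dOut = nullptr;
    cudaMalloc(&dOut, W * H * sizeof(float));
    dim3 block(16, 16), grid((W + 15) / 16, (H + 15) / 16);
    medianFilterTex<<<grid, block>>>(tex, dOut, W, H, R);
    cudaMemcpy(out.data(), dOut, W * H * sizeof(float), cudaMemcpyDeviceToHost);

    for (int row = 0; row < H; ++row) {
        for (int col = 0; col < W; ++col) printf("%5.1f ", out[row * W + col]);
        printf("\n");
    }

    cudaDestroyTextureObject(tex);
    cudaFreeArray(arr);
    cudaFree(dOut);
    return 0;
}
```

    The 0.5f offsets address texel centers, which is what unnormalized coordinates with point filtering expect. Clamp addressing repeats the edge pixels. The other address modes (wrap, mirror, border) give different border extensions, with some restrictions on which coordinate type they support.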
