What is a good sorting algorithm on CUDA?

南笙 2021-02-04 13:09

I have an array of struct and I need to sort this array according to a property of the struct (N). The object looks like this:

 struct OBJ
 { 
   int N; //sort a         


        
4 answers
  •  天涯浪人
    2021-02-04 13:31

    What do "big" and "small" mean?

    By "big" I assume you mean more than about 1M elements, and by "small" I assume small enough to fit in shared memory (probably fewer than 1K elements). If my understanding of "small" matches yours, I would try the following:

    • Use only a single block to sort the array (it can then be part of some bigger CUDA kernel).
    • Bitonic sort is one of the good approaches that can be adapted to a parallel algorithm.
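
    To make the two points above concrete, here is a minimal sketch of a single-block bitonic sort in shared memory. It is only an illustration under stated assumptions, not a tuned implementation: the struct in the question is cut off, so Item and its payload field are hypothetical stand-ins, and kCapacity, bitonicSortSingleBlock and sortSmallArray are names made up for this example. It also assumes that real keys are smaller than INT_MAX, because unused slots are padded with INT_MAX.

```cuda
#include <cuda_runtime.h>
#include <climits>
#include <cstddef>

// Hypothetical stand-in for the question's struct; only the key N is known.
struct Item {
    int N;        // sort key
    int payload;  // assumed extra field, for illustration only
};

// Assumed capacity: a power of two, with one thread per slot.
constexpr int kCapacity = 1024;

// Sorts data[0..count) ascending by N. Must be launched as <<<1, kCapacity>>>.
__global__ void bitonicSortSingleBlock(Item* data, int count) {
    __shared__ Item s[kCapacity];
    const int tid = threadIdx.x;

    // Load into shared memory; pad unused slots with the largest key
    // so that they end up behind the real elements.
    if (tid < count) {
        s[tid] = data[tid];
    } else {
        s[tid].N = INT_MAX;
        s[tid].payload = 0;
    }
    __syncthreads();

    // k is the size of the bitonic sequences being merged,
    // j is the distance between the two compared elements.
    for (int k = 2; k <= kCapacity; k <<= 1) {
        for (int j = k >> 1; j > 0; j >>= 1) {
            const int partner = tid ^ j;
            if (partner > tid) {  // the lower thread handles the pair
                const bool ascending = (tid & k) == 0;
                const Item a = s[tid];
                const Item b = s[partner];
                const bool outOfOrder = ascending ? (a.N > b.N) : (a.N < b.N);
                if (outOfOrder) {
                    s[tid] = b;
                    s[partner] = a;
                }
            }
            __syncthreads();  // reached by every thread in every step
        }
    }

    if (tid < count) {
        data[tid] = s[tid];
    }
}

// Host helper: copy to the GPU, sort in one block, copy back.
// Returns false if count does not fit or a CUDA call fails.
bool sortSmallArray(Item* hostItems, int count) {
    if (count < 0 || count > kCapacity) return false;
    if (count == 0) return true;

    Item* dev = nullptr;
    const size_t bytes = static_cast<size_t>(count) * sizeof(Item);
    if (cudaMalloc(&dev, bytes) != cudaSuccess) return false;

    bool ok = cudaMemcpy(dev, hostItems, bytes, cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        bitonicSortSingleBlock<<<1, kCapacity>>>(dev, count);
        ok = cudaMemcpy(hostItems, dev, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    cudaFree(dev);
    return ok;
}
```

    Inside a bigger kernel you would keep only the two nested loops and run them on data that is already in shared memory. Note that bitonic sort is not stable, so elements with equal N may change their relative order.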

    Some pages on bitonic sort:

    • Bitonic sort (nice explanation, an applet to visualise it, and Java source that does not take too much space)
    • Wikipedia (a bit too short an explanation for my taste, but more source code: some abstract language and Java)
    • NVIDIA code Samples (A sample source in CUDA. I think it is a bit overfocused on killing bank conflicts. I believe simpler code may actually perform faster.)

    I once also implemented a bubble sort (lol!) for a single warp to sort arrays of 32 elements. Thanks to its simplicity, it actually did not perform that badly. A well-tuned bitonic sort will still be faster, though.
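
    For the single-warp case, a sketch along these lines is one way to do it. This is not the code I wrote back then; it uses odd-even transposition sort, which is the usual parallel relative of bubble sort, and keeps one key per lane in a register. The names warpSort32 and sortWarp32 are made up for this example, and it sorts plain int keys only, assuming exactly 32 elements and a GPU that supports __shfl_sync.

```cuda
#include <cuda_runtime.h>

// Sorts exactly 32 keys ascending. Must be launched as <<<1, 32>>>.
__global__ void warpSort32(int* keys) {
    const unsigned fullMask = 0xffffffffu;
    const int lane = threadIdx.x;
    int key = keys[lane];

    // Odd-even transposition sort: 32 phases are enough for 32 elements.
    for (int phase = 0; phase < 32; ++phase) {
        int partner;
        if ((phase & 1) == 0) {
            partner = lane ^ 1;                       // pairs (0,1), (2,3), ...
        } else {
            partner = (lane & 1) ? lane + 1 : lane - 1;  // pairs (1,2), (3,4), ...
        }
        if (partner < 0 || partner > 31) {
            partner = lane;  // lanes 0 and 31 sit out the odd phases
        }

        // Every lane takes part in the shuffle, including the idle ones.
        const int other = __shfl_sync(fullMask, key, partner);

        // The lower lane of a pair keeps the smaller key, the upper the larger.
        if (lane < partner) {
            key = min(key, other);
        } else if (lane > partner) {
            key = max(key, other);
        }
    }

    keys[lane] = key;
}

// Host helper: sorts 32 ints through the GPU. Returns false on a CUDA error.
bool sortWarp32(int* hostKeys) {
    int* dev = nullptr;
    const size_t bytes = 32 * sizeof(int);
    if (cudaMalloc(&dev, bytes) != cudaSuccess) return false;

    bool ok = cudaMemcpy(dev, hostKeys, bytes, cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        warpSort32<<<1, 32>>>(dev);
        ok = cudaMemcpy(hostKeys, dev, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    cudaFree(dev);
    return ok;
}
```

    Because everything stays inside one warp, no shared memory and no __syncthreads() are needed. To sort structs this way, you would have to shuffle each field, or sort (key, index) pairs and gather afterwards.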
