CUDA: reduction or atomic operations?

眼角桃花 2021-01-14 19:00

I'm writing a CUDA kernel that computes the maximum value of a given matrix, and I'm evaluating the options. The best way I could find is:

Forcing ev

7 answers
  •  耶瑟儿~
    2021-01-14 19:19

    If you have a K20 or a Titan, I suggest dynamic parallelism: launch a single-thread kernel, which launches #items worker threads to produce the data, then launches #items/first-round-reduction-factor threads for the first round of reduction, and keeps launching further rounds until the result comes out.
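The scheme in this answer could look roughly like the sketch below. It is an illustration under stated assumptions, not the answerer's code: the names `max_round`, `max_parent`, `max_on_device` and the block size `kBlock` are made up for the example, the "produce data" step is replaced by a plain host upload, and the matrix is assumed to be stored as one contiguous array with at least one element. It relies on child kernels launched by the same parent thread into the default stream running in launch order, so no device-side synchronization is used between rounds.

```cuda
#include <cfloat>
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

#define CHECK(call) do { cudaError_t e = (call); if (e != cudaSuccess) { \
    fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(e)); exit(1); } } while (0)

constexpr int kBlock = 256;  // threads per block (assumed; must be a power of two)

// One reduction round: block b writes the maximum of its slice of in[0..n) to out[b].
__global__ void max_round(const float* in, float* out, int n) {
    __shared__ float tile[kBlock];
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    tile[threadIdx.x] = (i < n) ? in[i] : -FLT_MAX;  // pad the tail with the identity
    __syncthreads();
    for (int s = blockDim.x / 2; s > 0; s >>= 1) {
        if (threadIdx.x < s)
            tile[threadIdx.x] = fmaxf(tile[threadIdx.x], tile[threadIdx.x + s]);
        __syncthreads();
    }
    if (threadIdx.x == 0) out[blockIdx.x] = tile[0];
}

// Parent kernel, launched with a single thread. It keeps launching rounds,
// each shrinking the data by a factor of kBlock, until one value is left.
// s0 and s1 are scratch buffers; the last round writes straight into result.
__global__ void max_parent(const float* data, int n, float* s0, float* s1, float* result) {
    const float* src = data;
    bool use_s0 = true;
    for (;;) {
        int blocks = (n + kBlock - 1) / kBlock;
        float* dst = (blocks == 1) ? result : (use_s0 ? s0 : s1);
        max_round<<<blocks, kBlock>>>(src, dst, n);
        if (blocks == 1) break;
        src = dst;          // the next round reduces this round's partial maxima
        n = blocks;
        use_s0 = !use_s0;   // ping-pong so a round never reads and writes the same buffer
    }
}

// Host helper (hypothetical): uploads the matrix and returns its maximum. Requires h.size() >= 1.
float max_on_device(const std::vector<float>& h) {
    int n = static_cast<int>(h.size());
    int b1 = (n + kBlock - 1) / kBlock;   // partial results after round 1
    int b2 = (b1 + kBlock - 1) / kBlock;  // partial results after round 2
    float *d_data, *d_s0, *d_s1, *d_result;
    CHECK(cudaMalloc(&d_data, n * sizeof(float)));
    CHECK(cudaMalloc(&d_s0, b1 * sizeof(float)));
    CHECK(cudaMalloc(&d_s1, b2 * sizeof(float)));
    CHECK(cudaMalloc(&d_result, sizeof(float)));
    CHECK(cudaMemcpy(d_data, h.data(), n * sizeof(float), cudaMemcpyHostToDevice));
    max_parent<<<1, 1>>>(d_data, n, d_s0, d_s1, d_result);
    CHECK(cudaGetLastError());
    float out = 0.0f;
    // Blocking copy: returns only after the parent kernel and all of its children finish.
    CHECK(cudaMemcpy(&out, d_result, sizeof(float), cudaMemcpyDeviceToHost));
    cudaFree(d_data); cudaFree(d_s0); cudaFree(d_s1); cudaFree(d_result);
    return out;
}

int main() {
    const int rows = 512, cols = 512;  // 262144 elements, so three reduction rounds
    std::vector<float> m(rows * cols);
    for (int i = 0; i < rows * cols; ++i) m[i] = static_cast<float>(i % 1000);
    m[300 * cols + 7] = 12345.0f;      // planted maximum at row 300, column 7
    printf("max = %f\n", max_on_device(m));
    return 0;
}
```

Dynamic parallelism needs a device of compute capability 3.5 or higher, relocatable device code, and the device runtime library, so a build line along the lines of `nvcc -rdc=true -arch=sm_70 max.cu -lcudadevrt` should work (the architecture flag depends on your GPU). Whether this beats a host-driven loop of reduction kernels or an atomic-based approach is something to measure on the target hardware.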
