CUDA: reduction or atomic operations?


眼角桃花 2021-01-14 19:00

I'm writing a CUDA kernel that involves calculating the maximum value of a given matrix, and I'm evaluating the possibilities. The best way I could find is:

Forcing ev

7 answers

耶瑟儿～ (OP)

2021-01-14 19:19

If you have a K20 or Titan, I suggest dynamic parallelism: launch a single-thread kernel, which launches #items worker threads to produce the data, then launches #items/first-round-reduction-factor threads for the first round of reduction, and keeps launching further rounds until the result comes out.
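A minimal sketch of that idea, not a tuned implementation. It assumes the data is already in device memory (so the "produce data" step is skipped), a reduction factor of 256 per round, and a device that supports dynamic parallelism (compute capability 3.5 or higher, built with something like `nvcc -rdc=true file.cu -lcudadevrt`). The names `max_round`, `copy_one`, `max_parent`, and `device_max` are made up for illustration. It relies on child kernels launched by the same thread into the same stream running in order, so the parent never has to synchronize on the device.

```cuda
#include <cassert>
#include <cfloat>
#include <vector>
#include <cuda_runtime.h>

constexpr int BLOCK = 256;  // threads per block, also the per-round reduction factor (assumed)

// One reduction round: each block writes the max of its BLOCK inputs to out[blockIdx.x].
__global__ void max_round(const float* in, float* out, int n) {
    __shared__ float s[BLOCK];
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    s[threadIdx.x] = (i < n) ? in[i] : -FLT_MAX;  // pad the tail with the identity for max
    __syncthreads();
    for (int stride = blockDim.x / 2; stride > 0; stride >>= 1) {
        if (threadIdx.x < stride)
            s[threadIdx.x] = fmaxf(s[threadIdx.x], s[threadIdx.x + stride]);
        __syncthreads();
    }
    if (threadIdx.x == 0) out[blockIdx.x] = s[0];
}

// Tiny child kernel so the final value is copied after the last round, in stream order.
__global__ void copy_one(const float* src, float* dst) { *dst = *src; }

// Single-thread parent: keeps launching rounds until one value is left.
// Ping-pongs between a and b, so the input in a is overwritten.
__global__ void max_parent(float* a, float* b, int n, float* result) {
    float* src = a;
    float* dst = b;
    while (n > 1) {
        int blocks = (n + BLOCK - 1) / BLOCK;
        max_round<<<blocks, BLOCK>>>(src, dst, n);  // child launches run in order
        n = blocks;
        float* t = src; src = dst; dst = t;
    }
    copy_one<<<1, 1>>>(src, result);
}

// Host helper (hypothetical): copies the data up, runs the parent, returns the max.
float device_max(const std::vector<float>& h) {
    assert(!h.empty());
    int n = static_cast<int>(h.size());
    int blocks = (n + BLOCK - 1) / BLOCK;
    float *a = nullptr, *b = nullptr, *r = nullptr;
    cudaMalloc(&a, n * sizeof(float));
    cudaMalloc(&b, blocks * sizeof(float));
    cudaMalloc(&r, sizeof(float));
    cudaMemcpy(a, h.data(), n * sizeof(float), cudaMemcpyHostToDevice);
    max_parent<<<1, 1>>>(a, b, n, r);
    float out = 0.0f;
    // This copy waits for the parent and all of its child kernels to finish.
    cudaMemcpy(&out, r, sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(a); cudaFree(b); cudaFree(r);
    return out;
}
```

Calling `device_max` on a host vector that holds the matrix in flat (row-major) form should return its largest element. Error checking on the CUDA calls is left out to keep the sketch short. Whether this beats a host-driven multi-pass reduction or an atomic-based approach depends on the hardware and the problem size, so it is worth measuring.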
