CUDA: reduction or atomic operations?


眼角桃花 2021-01-14 19:00

I'm writing a CUDA kernel that involves calculating the maximum value of a given matrix, and I'm evaluating the possibilities. The best way I could find is:

Forcing ev

7 answers

暗喜 (original poster)

2021-01-14 19:10

You may also want to use the reduction routines that come with Thrust, which ships as part of CUDA 4.0 and is also available separately.

The library is written by a pair of NVIDIA engineers and compares favorably with heavily hand-optimized code. I believe there is also some auto-tuning of grid/block size going on.

You can interface with your own kernel easily by wrapping your raw device pointers.
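As a rough sketch of what that wrapping might look like: the raw pointer from `cudaMalloc` is wrapped with `thrust::device_pointer_cast`, and `thrust::reduce` is called with `thrust::maximum<float>` as the reduction operator. The helper name `device_max`, the 4x5 matrix size, and the fill pattern are assumptions made up for illustration; they are not from the question. The matrix is treated as a flat row-major array, which is enough when only the global maximum is needed.

```cuda
#include <cfloat>
#include <cstdio>
#include <cuda_runtime.h>
#include <thrust/device_ptr.h>
#include <thrust/functional.h>
#include <thrust/reduce.h>

// Hypothetical helper (not part of Thrust): returns the maximum of n floats
// that already live in device memory, e.g. written by your own kernel.
float device_max(const float* d_data, size_t n) {
    // Wrap the raw device pointer so Thrust dispatches to the GPU backend.
    thrust::device_ptr<const float> first = thrust::device_pointer_cast(d_data);
    // Reduce with max; -FLT_MAX is the identity for a float maximum.
    return thrust::reduce(first, first + n, -FLT_MAX, thrust::maximum<float>());
}

int main() {
    // Assumed example data: a small row-major matrix filled on the host.
    const int rows = 4, cols = 5, n = rows * cols;
    float h_matrix[n];
    for (int i = 0; i < n; ++i) h_matrix[i] = static_cast<float>((i * 7) % 20);

    // Allocate device memory and copy the matrix over.
    float* d_matrix = nullptr;
    cudaMalloc(&d_matrix, n * sizeof(float));
    cudaMemcpy(d_matrix, h_matrix, n * sizeof(float), cudaMemcpyHostToDevice);

    // The result comes back to the host as a plain float.
    printf("max = %f\n", device_max(d_matrix, n));

    cudaFree(d_matrix);
    return 0;
}
```

Compile with `nvcc`. Note that `thrust::reduce` returns its result to the host, so it synchronizes with the device; if the maximum is only consumed by a later kernel, a hand-written reduction that leaves the value in device memory may still be worth considering.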

This is strictly from a rapid integration point of view. For the theory, see tkerwin's answer.
