Finding max value in CUDA

Posted by 不打扰是莪最后的温柔 on 2019-12-06 02:16:25

This is purely a reduction problem. Here's a good presentation by NVIDIA on optimizing reductions on GPUs. You can use the same technique to find the minimum, the maximum, or the sum of all elements.
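To make the reduction idea concrete, here is a minimal sketch of a shared-memory tree reduction adapted to max. The kernel name `block_max`, the test data, and the launch configuration are my own assumptions for illustration, and it is not the fully optimized version from the presentation. It assumes the block size is a power of two and finishes the last step on the host.

```cuda
#include <algorithm>
#include <cfloat>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Each block reduces its slice of `in` to one partial maximum in `out`.
// Assumes blockDim.x is a power of two.
__global__ void block_max(const float *in, float *out, int n)
{
    extern __shared__ float sdata[];
    int tid = threadIdx.x;
    int i = blockIdx.x * blockDim.x + threadIdx.x;

    // Out-of-range threads contribute the identity element for max.
    sdata[tid] = (i < n) ? in[i] : -FLT_MAX;
    __syncthreads();

    // Tree reduction in shared memory: halve the active threads each step.
    for (int s = blockDim.x / 2; s > 0; s >>= 1) {
        if (tid < s)
            sdata[tid] = fmaxf(sdata[tid], sdata[tid + s]);
        __syncthreads();
    }

    // Thread 0 writes this block's result.
    if (tid == 0)
        out[blockIdx.x] = sdata[0];
}

int main()
{
    const int n = 1 << 20, threads = 256;
    const int blocks = (n + threads - 1) / threads;

    // Illustrative data with a planted maximum.
    std::vector<float> h(n);
    for (int k = 0; k < n; ++k) h[k] = static_cast<float>(k % 1000);
    h[12345] = 5000.0f;

    float *d_in, *d_partial;
    cudaMalloc(&d_in, n * sizeof(float));
    cudaMalloc(&d_partial, blocks * sizeof(float));
    cudaMemcpy(d_in, h.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    block_max<<<blocks, threads, threads * sizeof(float)>>>(d_in, d_partial, n);

    // One partial result per block; combine them on the host.
    std::vector<float> partial(blocks);
    cudaMemcpy(partial.data(), d_partial, blocks * sizeof(float),
               cudaMemcpyDeviceToHost);
    float result = -FLT_MAX;
    for (float p : partial) result = std::max(result, p);

    printf("max = %f\n", result);
    cudaFree(d_in);
    cudaFree(d_partial);
    return 0;
}
```

Swapping `fmaxf` for `fminf` (with `FLT_MAX` as the identity) or for addition (with `0.0f`) gives the min and sum variants. For large inputs you could also launch the kernel a second time on the partial results instead of finishing on the host.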

The link to the Thrust library is broken.
If anyone wants to use Thrust for this, the documentation is here:
Thrust, extrema reductions
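For anyone going the Thrust route, a minimal sketch using `thrust::max_element` from `<thrust/extrema.h>` might look like this. The vector size and the values are made up for illustration.

```cuda
#include <cstdio>
#include <thrust/device_vector.h>
#include <thrust/extrema.h>
#include <thrust/host_vector.h>

int main()
{
    // Illustrative data with a planted maximum.
    thrust::host_vector<float> h(1000);
    for (int k = 0; k < 1000; ++k) h[k] = static_cast<float>(k % 100);
    h[321] = 250.0f;

    // Copy to the GPU.
    thrust::device_vector<float> d = h;

    // max_element returns an iterator to the largest element.
    thrust::device_vector<float>::iterator it =
        thrust::max_element(d.begin(), d.end());

    float max_val = *it;                          // copies one value to the host
    int index = static_cast<int>(it - d.begin()); // position of the maximum
    printf("max = %f at index %d\n", max_val, index);
    return 0;
}
```

`thrust::min_element` and `thrust::minmax_element` work the same way. If you only need the value and not its position, `thrust::reduce` with `thrust::maximum<float>()` is an alternative.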

I was looking for the same answer, but found most of them too daunting for a newbie like me. Here is my example code to find the max. Please let me know whether this is used properly.

__global__
void find_max(int max_x, int max_y, float *tot, float *x, float *y)
{
    int i = blockIdx.x*blockDim.x + threadIdx.x;
    int j = blockIdx.y*blockDim.y + threadIdx.y;
    if(i < max_x && j<max_y) {
        if(*tot < x[i])
            atomicExch(tot, x[i]);
    }
}
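One concern with the kernel above: the comparison `*tot < x[i]` and the `atomicExch` are two separate steps, so another thread can store a larger value in between and then have it overwritten by a smaller one. Also, `j`, `y` and `max_y` never affect the result. CUDA has no built-in `atomicMax` for `float`, so a common workaround is a compare-and-swap loop. The sketch below is one way to do that, under my own assumptions: `atomic_max_float` is a hypothetical helper name, the launch is 1D, and `tot` is initialized to `-FLT_MAX` before the launch.

```cuda
#include <cfloat>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper (not a CUDA built-in): atomic max for float,
// implemented with atomicCAS on the value's bit pattern.
__device__ float atomic_max_float(float *addr, float value)
{
    int *addr_as_int = reinterpret_cast<int *>(addr);
    int old = *addr_as_int;
    // Retry only while our value is still larger than what is stored.
    while (value > __int_as_float(old)) {
        int assumed = old;
        old = atomicCAS(addr_as_int, assumed, __float_as_int(value));
        if (old == assumed)
            break;  // our value was stored
    }
    return __int_as_float(old);
}

// 1D version of the kernel: every thread folds one element into *tot.
__global__ void find_max(int n, float *tot, const float *x)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        atomic_max_float(tot, x[i]);
}

int main()
{
    const int n = 1 << 16, threads = 256;

    // Illustrative data with a planted maximum.
    std::vector<float> h(n);
    for (int k = 0; k < n; ++k) h[k] = static_cast<float>(k % 500) - 250.0f;
    h[777] = 1234.5f;

    float *d_x, *d_tot;
    float init = -FLT_MAX;  // identity element for max
    cudaMalloc(&d_x, n * sizeof(float));
    cudaMalloc(&d_tot, sizeof(float));
    cudaMemcpy(d_x, h.data(), n * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(d_tot, &init, sizeof(float), cudaMemcpyHostToDevice);

    find_max<<<(n + threads - 1) / threads, threads>>>(n, d_tot, d_x);

    float result;
    cudaMemcpy(&result, d_tot, sizeof(float), cudaMemcpyDeviceToHost);
    printf("max = %f\n", result);

    cudaFree(d_x);
    cudaFree(d_tot);
    return 0;
}
```

This is correct but serializes every update on a single address, so for large arrays a proper reduction (or a per-block reduction followed by one atomic per block) will usually be much faster.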

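Unless you specifically want to write a reduction kernel yourself, the simplest way is to use cuBLAS. Keep in mind that its cublasI<t>amax routines return the 1-based index of the element with the largest absolute value, so they give you the maximum directly only when the data is non-negative.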
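As a rough sketch of the cuBLAS route, the example below calls `cublasIsamax`, which reports the 1-based index of the element with the largest magnitude, and then copies that single element back. The data is made up and non-negative on purpose, and error checking is left out for brevity. Build with `-lcublas`.

```cuda
#include <cstdio>
#include <vector>
#include <cublas_v2.h>
#include <cuda_runtime.h>

int main()
{
    const int n = 1000;

    // Illustrative non-negative data with a planted maximum.
    std::vector<float> h(n);
    for (int k = 0; k < n; ++k) h[k] = static_cast<float>(k % 100);
    h[321] = 250.0f;

    float *d_x;
    cudaMalloc(&d_x, n * sizeof(float));
    cudaMemcpy(d_x, h.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    cublasHandle_t handle;
    cublasCreate(&handle);

    // Index of the element with the largest |x[i]|, using 1-based indexing.
    int idx = 0;
    cublasIsamax(handle, n, d_x, 1, &idx);

    // Fetch just that element (convert to a 0-based offset).
    float max_val;
    cudaMemcpy(&max_val, d_x + (idx - 1), sizeof(float),
               cudaMemcpyDeviceToHost);
    printf("max |x| = %f at 0-based index %d\n", max_val, idx - 1);

    cublasDestroy(handle);
    cudaFree(d_x);
    return 0;
}
```

If the array can contain negative values, this finds the largest magnitude rather than the largest value, so a reduction or Thrust is the safer choice there.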
