Atomic max for floats in OpenCL

Posted by 不打扰是莪最后的温柔 on 2019-12-23 23:54:04

Question


I need an atomic max function for floats in OpenCL. This is my current naive code, which uses atomic_xchg:

float value = data[index];
if ( value  > *max_value )
{
    atomic_xchg(max_value, value);
}

This code gives the correct result on an Intel CPU, but not on an NVIDIA GPU. Is this code correct? If not, can anyone help me fix it?


Answer 1:


You can do it like this:

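// Atomic max for floats, built on a compare-and-swap (CAS) retry loop.
inline void AtomicMax(volatile __global float *source, const float operand) {
    // The unions reinterpret a float's bits as an unsigned int for atomic_cmpxchg.
    union {
        unsigned int intVal;
        float floatVal;
    } newVal;
    union {
        unsigned int intVal;
        float floatVal;
    } prevVal;
    do {
        prevVal.floatVal = *source;                        // snapshot the current value
        newVal.floatVal = max(prevVal.floatVal, operand);  // candidate maximum
        // Store newVal only if *source still holds prevVal; otherwise retry.
    } while (atomic_cmpxchg((volatile __global unsigned int *)source, prevVal.intVal, newVal.intVal) != prevVal.intVal);
}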

__kernel void mykern(__global float *data, __global float *max_value){
    unsigned int index = get_global_id(0);

    float value = data[index];
    AtomicMax(max_value, value);
}

As stated in LINK.

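What it does is create a union of a float and an unsigned int. The max is computed on the float, but the compare-and-swap (atomic_cmpxchg) works on the integer bit pattern. If the value at source still matches the one that was read, the new value is stored and the loop ends; if another work-item changed it in the meantime, the loop reads the value again and retries.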

However, this approach is slow, particularly when many work-items update the same address at once and have to retry. Use it carefully.
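
One common way to reduce that cost is to take the maximum within each work-group in local memory first, and then issue a single atomic update per work-group. The following is only a sketch under stated assumptions, not part of the original answer: the names atomic_max_float, max_reduce, scratch and n are made up for illustration, the work-group size is assumed to be a power of two, and max_value is assumed to be initialized by the host (for example to -INFINITY) before the kernel runs.

```c
// Illustrative sketch; all names here are assumptions, not from the original answer.

// CAS-based float max, the same technique as AtomicMax above.
inline void atomic_max_float(volatile __global float *addr, float operand)
{
    union { unsigned int u; float f; } old_val, new_val;
    do {
        old_val.f = *addr;                     // snapshot the current value
        new_val.f = fmax(old_val.f, operand);  // candidate maximum
        // Retry if another work-item changed *addr since the snapshot.
    } while (atomic_cmpxchg((volatile __global unsigned int *)addr,
                            old_val.u, new_val.u) != old_val.u);
}

__kernel void max_reduce(__global const float *data,
                         __global float *max_value,
                         __local float *scratch,   // one float per work-item
                         const unsigned int n)     // number of valid elements
{
    const size_t gid = get_global_id(0);
    const size_t lid = get_local_id(0);

    // Out-of-range work-items contribute -INFINITY, the identity for max.
    scratch[lid] = (gid < n) ? data[gid] : -INFINITY;
    barrier(CLK_LOCAL_MEM_FENCE);

    // Tree reduction in local memory (assumes a power-of-two work-group size).
    for (size_t stride = get_local_size(0) / 2; stride > 0; stride /= 2) {
        if (lid < stride)
            scratch[lid] = fmax(scratch[lid], scratch[lid + stride]);
        barrier(CLK_LOCAL_MEM_FENCE);
    }

    // One atomic update per work-group instead of one per work-item.
    if (lid == 0)
        atomic_max_float(max_value, scratch[0]);
}
```

On the host side, the scratch buffer would be supplied with clSetKernelArg(kernel, 2, local_size * sizeof(float), NULL), where local_size is the chosen work-group size. Whether this is actually faster depends on the device and the data size, so it is worth measuring.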



Source: https://stackoverflow.com/questions/18950732/atomic-max-for-floats-in-opencl
