I'm doing a reduction (finding the minimum and maximum) of a float[]
array on a GPU through OpenCL.
I'm loading the some elements from global
memory into local
memory for each workgroup. When the global size isn't a multiple of the workgroup size, I pad the global size, such that it becomes a multiple of the global size. Work-items past the end of the array put the neutral element of the reduction into local
memory.
But what should that neutral element be for max()
-- the maximum function?
The OpenCL documentation gives MAXFLOAT
, HUGE_VALF
and INFINITY
as very large positive (or unsigned) float
values.
Does it makes sense to have the neutral element to be -INFINITY
for example?
Right now I'm using HUGE_VALF
as the neutral element for min()
, but the docs also say that HUGE_VALF
is used as an error value, so maybe that's a bad idea.
Reduction kernel (Code):
#define NEUTRAL_ELEMENT HUGE_VALF
#define REDUCTION_OP min
__kernel void reduce(__global float* weights,
__local float* weights_cached
)
{
unsigned int id = get_global_id(0);
// Load data
if (id < {{ point_count }}) {
weights_cached[get_local_id(0)] = weights[id];
} else {
weights_cached[get_local_id(0)] = NEUTRAL_ELEMENT;
}
barrier(CLK_LOCAL_MEM_FENCE);
// Reduce
for(unsigned int stride = get_local_size(0) / 2; stride >= 1; stride /= 2) {
if (get_local_id(0) < stride) {
weights_cached[get_local_id(0)] = REDUCTION_OP(weights_cached[get_local_id(0)], weights_cached[get_local_id(0) + stride]);
barrier(CLK_LOCAL_MEM_FENCE);
}
// Save
weights[get_group_id(0)] = weights_cached[0];
}
Edit:
I actually ended up using fmin()
and fmax()
together with NAN
as the neutral element -- this is basically guaranteed to work according to the OpenCL documentation as the numerical value will always be returned (NAN
is only returned if two NAN
values are given).
Quoting the OpenCL standard:
HUGE_VALF evaluates to +infinity.
So there's no real difference (except for implied intent) between using HUGE_VALF
and INFINITY
; either will work correctly for a min
reduction. In terms of clarity, I have a slight preference for INFINITY
, as HUGE_VALF
is conceptually intended for edge-case returns, which this isn't.
Similarly, use -INFINITY
for a max
reduction.
MAX_FLOAT
will not behave correctly as a neutral element if your array contains infinities.
来源:https://stackoverflow.com/questions/14480225/neutral-element-for-min-and-max-in-opencl-reduction