Question
I am writing a fragment shader to take the per-pixel median of 9 images.
I have never worked with GLSL before, but it seemed like the right tool for the job, as OpenCL isn't available on iOS and computing the median on the CPU is inefficient. Here's what I have so far:
uniform sampler2D frames[9];
uniform vec2 wh;
void main(void)
{
    vec4 sortedFrameValues[9];
    float sortedGrayScaleValues[9];
    for (int i = 0; i < 9; i++)
    {
        sortedFrameValues[i] = texture2D(frames[i], -gl_FragCoord.xy / wh);
        sortedGrayScaleValues[i] = dot(sortedFrameValues[i].xyz, vec3(0.299, 0.587, 0.114));
    }
    // TODO: Sort sortedGrayScaleValues
    float gray = sortedGrayScaleValues[4];
    gl_FragColor = vec4(gray, gray, gray, 0);
}
Answer 1:
A little late, but the fastest general-purpose approach I've found is insertion sort. Reducing shader complexity and divergence is key. Bitonic and bubble sort work pretty well too for small arrays. Once you get up to around 100 elements, switch to merge sort.
Since you know the number of things to sort (9) your best bet is a sort network. You could use this handy tool to generate it...
There are 27 comparators in this network, grouped into 11 parallel operations.
[[0,1],[2,3],[4,5],[7,8]]
[[0,2],[1,3],[6,8]]
[[1,2],[6,7],[5,8]]
[[4,7],[3,8]]
[[4,6],[5,7]]
[[5,6],[2,7]]
[[0,5],[1,6],[3,7]]
[[0,4],[1,5],[3,6]]
[[1,4],[2,5]]
[[2,4],[3,5]]
[[3,4]]
A handy way to use this is declare a compare-and-swap macro...
#define CMP(a, b) ...
#define SWAP(a, b) ...
#define CSWAP(a, b) if (CMP(a, b)) {SWAP(a, b);}
CSWAP(0, 1); CSWAP(2, 3); ...
Combining both approaches works very well: use a sort network to quickly sort small blocks of data, then merge sort if you have many blocks, as described in Fast Sorting for Exact OIT of Complex Scenes (disclaimer: I'm an author).
Unrolling loops (essentially creating a sort network) can be particularly beneficial, allowing sorting in registers. Dynamically indexed arrays are placed in local memory, which is slow. To force the compiler not to do this, you could manually declare vec4 array0, array1 ... Macros can concatenate text, which is useful here: #define CMP(a, b) (array##a < array##b). A rather ugly but fast example is here.
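As a rough sketch, the 27-comparator network listed above could be wired into the question's shader roughly as follows. This is untested and makes a few assumptions: the LUMA macro, the v0..v8 names and the scratch variable t are hypothetical; CSWAP is written with min/max rather than the if-based form above, to avoid branching; and the precision line is only there because GLSL ES fragment shaders need a default float precision.

```glsl
#ifdef GL_ES
precision mediump float;  // assumption: required default precision on GLSL ES
#endif

uniform sampler2D frames[9];
uniform vec2 wh;

// Hypothetical helper: grayscale value of frame i, using the question's weights.
#define LUMA(i) dot(texture2D(frames[i], uv).xyz, vec3(0.299, 0.587, 0.114))

// Hypothetical branch-free compare-and-swap: afterwards a <= b.
#define CSWAP(a, b) t = min(a, b); b = max(a, b); a = t

void main(void)
{
    // Texture coordinate expression copied from the question.
    vec2 uv = -gl_FragCoord.xy / wh;

    // Nine named scalars instead of an array, so nothing is dynamically indexed.
    float v0 = LUMA(0), v1 = LUMA(1), v2 = LUMA(2);
    float v3 = LUMA(3), v4 = LUMA(4), v5 = LUMA(5);
    float v6 = LUMA(6), v7 = LUMA(7), v8 = LUMA(8);
    float t;  // scratch variable used by CSWAP

    // The 27 comparators listed above, one parallel group per line.
    CSWAP(v0, v1); CSWAP(v2, v3); CSWAP(v4, v5); CSWAP(v7, v8);
    CSWAP(v0, v2); CSWAP(v1, v3); CSWAP(v6, v8);
    CSWAP(v1, v2); CSWAP(v6, v7); CSWAP(v5, v8);
    CSWAP(v4, v7); CSWAP(v3, v8);
    CSWAP(v4, v6); CSWAP(v5, v7);
    CSWAP(v5, v6); CSWAP(v2, v7);
    CSWAP(v0, v5); CSWAP(v1, v6); CSWAP(v3, v7);
    CSWAP(v0, v4); CSWAP(v1, v5); CSWAP(v3, v6);
    CSWAP(v1, v4); CSWAP(v2, v5);
    CSWAP(v2, v4); CSWAP(v3, v5);
    CSWAP(v3, v4);

    // v0..v8 are now in ascending order, so v4 is the median.
    gl_FragColor = vec4(v4, v4, v4, 0.0);  // alpha 0 as in the question
}
```

Because every comparator runs unconditionally, all fragments execute the same instruction sequence, which is the point of using a fixed network here.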
Answer 2:
Well, I ended up implementing a bubble sort and using the middle value.
This is what my solution looks like:
uniform sampler2D frames[9];
uniform vec2 wh;
vec4 frameValues[9];
float arr[9];
// Bubble sort over all 9 entries of arr; stops early once a pass makes no swap.
void bubbleSort()
{
    bool swapped = true;
    int j = 0;
    float tmp;
    for (int c = 0; c < 9; c++)
    {
        if (!swapped)
            break;
        swapped = false;
        j++;
        for (int i = 0; i < 8; i++)
        {
            // Entries above index 9 - j are already in their final position.
            if (i >= 9 - j)
                break;
            if (arr[i] > arr[i + 1])
            {
                tmp = arr[i];
                arr[i] = arr[i + 1];
                arr[i + 1] = tmp;
                swapped = true;
            }
        }
    }
}
void main(void)
{
    for (int i = 0; i < 9; i++)
    {
        frameValues[i] = texture2D(frames[i], -gl_FragCoord.xy / wh);
        arr[i] = dot(frameValues[i].xyz, vec3(0.299, 0.587, 0.114));
    }
    bubbleSort();
    // The middle element of the sorted array is the median.
    float gray = arr[4];
    gl_FragColor = vec4(gray, gray, gray, 0);
}
Answer 3:
This is just a normal sorting problem, is it not? The fastest way I know to find the median is the Median of Medians approach.
It may make more sense not to put your values into your "sorted" array until they're sorted.
You don't need the sortedFrameValues variable to be an array, at least as you're using it here - you never use any of the stored values again. You just need it as a single variable.
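A minimal sketch of both suggestions, under stated assumptions: it keeps the sampled color in a single temporary, and it finds the median with a partial selection sort instead of a full sort. To be clear, this is not the Median of Medians algorithm mentioned above, just a simpler way to stop once the fifth-smallest value is known. The names v, c, lo and uv are hypothetical, the code is untested, and the precision line assumes GLSL ES.

```glsl
#ifdef GL_ES
precision mediump float;  // assumption: required default precision on GLSL ES
#endif

uniform sampler2D frames[9];
uniform vec2 wh;

void main(void)
{
    // Texture coordinate expression copied from the question.
    vec2 uv = -gl_FragCoord.xy / wh;

    float v[9];
    for (int i = 0; i < 9; i++)
    {
        // A single temporary replaces the sortedFrameValues array.
        vec4 c = texture2D(frames[i], uv);
        v[i] = dot(c.xyz, vec3(0.299, 0.587, 0.114));
    }

    // Partial selection sort: pass k moves the smallest of v[k..8] into v[k].
    // Only 5 passes are needed because the median of 9 is the 5th smallest.
    for (int k = 0; k < 5; k++)
    {
        // The loop starts at 0 with a guard, keeping the loop bounds constant.
        for (int i = 0; i < 9; i++)
        {
            if (i > k)
            {
                float lo = min(v[k], v[i]);
                v[i] = max(v[k], v[i]);
                v[k] = lo;
            }
        }
    }

    // v[0..4] now hold the five smallest values in order; v[4] is the median.
    gl_FragColor = vec4(v[4], v[4], v[4], 0.0);  // alpha 0 as in the question
}
```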
Answer 4:
You can use a Core Image kernel in your iOS app to estimate the median pixel value in a source-pixel neighborhood of a radius of your choosing. The kernel is written in Core Image Kernel Language, a GLSL dialect, and looks like this:
kernel vec4 medianUnsharpKernel(sampler u) {
    vec4 pixel = unpremultiply(sample(u, samplerCoord(u)));
    vec2 xy = destCoord();
    int radius = 3;
    int bounds = (radius - 1) / 2;
    // Step 1: mean of the radius x radius neighborhood.
    vec4 sum = vec4(0.0);
    for (int i = (0 - bounds); i <= bounds; i++)
    {
        for (int j = (0 - bounds); j <= bounds; j++ )
        {
            sum += unpremultiply(sample(u, samplerTransform(u, vec2(xy + vec2(i, j)))));
        }
    }
    vec4 mean = vec4(sum / vec4(pow(float(radius), 2.0)));
    // float(vec4) keeps only the first (red) component.
    float mean_avg = float(mean);
    float comp_avg = 0.0;
    vec4 comp = vec4(0.0);
    // Start from zero; starting from the mean would never let a below-mean value win the max().
    vec4 median = vec4(0.0);
    // Step 2: largest neighborhood value that lies below the mean.
    for (int i = (0 - bounds); i <= bounds; i++)
    {
        for (int j = (0 - bounds); j <= bounds; j++ )
        {
            comp = unpremultiply(sample(u, samplerTransform(u, vec2(xy + vec2(i, j)))));
            comp_avg = float(comp);
            median = (comp_avg < mean_avg) ? max(median, comp) : median;
        }
    }
    // Step 3 (optional): difference between source pixel and median, for edge detection.
    return premultiply(vec4(vec3(abs(pixel.rgb - median.rgb)), 1.0));
}
This is far less complicated, with no sorting required. It involves two steps, plus an optional third:
1. Calculate the mean of the pixel values in the 3x3 neighborhood surrounding the source pixel.
2. Find the maximum of all pixel values in the same neighborhood that are less than the mean.
3. [OPTIONAL] Subtract the median pixel value from the source pixel value for edge detection.
Step 2 gives an approximation of the median, not necessarily the exact middle value.
If you're using the median value for edge detection, there are a couple of ways to modify the above code for better results, namely hybrid median filtering and truncated median filtering (a substitute for, and an improvement on, 'mode' filtering). If you're interested, please ask.
Source: https://stackoverflow.com/questions/19440389/how-to-write-a-fragment-shader-in-glsl-to-sort-an-array-of-9-floating-point-numb