I have a setup where I need to lock, read some data, process, write some data, and then unlock. To this end, I made a locking texture as a layout(r32ui) coherent unif
I have standardized the locking texture to be img0.
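To make the snippets in this post concrete, here is a minimal sketch of the kind of declarations they rely on. The image type (`uimage2D`), the binding point, the per-fragment `coord`, and the `frag_color` output are assumptions for illustration only. The convention that 0u means free and 1u means held is the one implied by the exchange calls used throughout.

```glsl
#version 420 core

// Assumption: the lock image is a 2D unsigned-integer image bound at unit 0.
layout(binding = 0, r32ui) coherent uniform uimage2D img0;

// Hypothetical output, only here so the shader is complete.
out vec4 frag_color;

void main() {
    // Assumption: one lock word per pixel, addressed by the window coordinate.
    ivec2 coord = ivec2(gl_FragCoord.xy);

    // Read the lock word without modifying it (0u = free, 1u = held).
    uint lock_state = imageLoad(img0, coord).x;

    // Visualize the lock state; a real shader would do its work here instead.
    frag_color = vec4(float(lock_state), 0.0, 0.0, 1.0);
}
```

The `coherent` qualifier matters because several invocations read and write the same texel; without it, writes from one invocation are not guaranteed to become visible to the others.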
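Type 1: Thread warps have a shared program counter. If a single thread grabs the lock, the other threads in the warp are still stuck in the loop, and as far as I understand it the thread holding the lock cannot advance past the loop to release it while they keep spinning. In practice, this compiles but results in a deadlock.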
Examples: StackOverflow, OpenGL.org
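// Spin until the exchange returns something other than 1u (the lock was free).
while (imageAtomicExchange(img0,coord,1u)==1u);
// critical section: read, process, write
memoryBarrier();
imageAtomicExchange(img0,coord,0u); // release the lock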
Type 2: To work around the Type 1 problem, one instead performs the critical section conditionally, inside the loop. In the examples below I have sometimes written the loop as a do-while, but a plain while loop doesn't work correctly either.
Type 2.1: The first thing one tries is a simple loop. Apparently because of buggy optimizations, this can result in a crash (I haven't tried it recently).
Example: NVIDIA
bool have_written = false;
while (true) {
    bool can_write = (imageAtomicExchange(img0,coord,1u)!=1u);
    if (can_write) {
        // critical section: read, process, write
        memoryBarrier();
        imageAtomicExchange(img0,coord,0u); // release the lock
        break;
    }
}
Type 2.2: The above example uses imageAtomicExchange(...), which might not be the first thing one tries. The most intuitive choice is imageAtomicCompSwap(...). Unfortunately, this doesn't work due to buggy optimizations. Otherwise it should be sound.
Example: StackOverflow
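bool have_written = false;
do {
    // Store 1u only if the lock word is currently 0u; the old value is returned.
    bool can_write = (imageAtomicCompSwap(img0,coord,0u,1u)==0u);
    if (can_write) {
        // critical section: read, process, write
        memoryBarrier();
        imageAtomicExchange(img0,coord,0u); // release the lock
        have_written = true;
    }
} while (!have_written);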
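For reference, the two acquisition tests used in these variants differ only in when they store to the lock word. The sketch below isolates them as two helper functions. The helper names, the `uimage2D` type, and the binding point are hypothetical, and `main` makes a single attempt rather than spinning, so it only illustrates the primitives and does not address the looping problems described in this post.

```glsl
#version 420 core

// Assumption: 2D lock image where 0u = free and 1u = held.
layout(binding = 0, r32ui) coherent uniform uimage2D img0;

// Hypothetical helper: unconditionally stores 1u and returns the old value.
// The lock was acquired if the old value was not 1u.
bool try_lock_exchange(ivec2 coord) {
    return imageAtomicExchange(img0, coord, 1u) != 1u;
}

// Hypothetical helper: stores 1u only if the current value equals 0u and
// returns the old value either way. The lock was acquired if that was 0u.
bool try_lock_compswap(ivec2 coord) {
    return imageAtomicCompSwap(img0, coord, 0u, 1u) == 0u;
}

void main() {
    ivec2 coord = ivec2(gl_FragCoord.xy);

    // Single attempt, no loop: fragments that lose the race skip the work.
    if (try_lock_compswap(coord)) {
        // critical section: read, process, write
        memoryBarrier();
        imageAtomicExchange(img0, coord, 0u); // release the lock
    }
}
```

With only the values 0u and 1u in play, both helpers leave the lock word in the same state. The exchange version just rewrites 1u over an existing 1u when the lock is already held.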
Type 2.3: Switching back from imageAtomicCompSwap(...) to imageAtomicExchange(...) is the other common variant. The difference from 2.1 is the way the loop is terminated: a have_written flag checked in the loop condition instead of a break. This doesn't work correctly for me.
Examples: StackOverflow, StackOverflow
bool have_written = false;
do {
    bool can_write = (imageAtomicExchange(img0,coord,1u)!=1u);
    if (can_write) {
        // critical section: read, process, write
        memoryBarrier();
        imageAtomicExchange(img0,coord,0u); // release the lock
        have_written = true;
    }
} while (!have_written);