Raymond Chen has been doing a huge series on lock-free algorithms. Beyond the simple cases of the InterlockedXxx functions, it seems like the prevailing pattern
Lock-free code also has the advantage that it never sleeps. There are places in kernels where you are not permitted to sleep (the Windows kernel has a bunch of them), and that painfully restricts which data structures you can use, since anything guarded by a lock that might sleep is off-limits.
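
To make the "does not sleep" point concrete, here is a minimal sketch of a lock-free update. It assumes portable C++11 `std::atomic` rather than the Windows InterlockedXxx functions, and `atomic_store_max` is a hypothetical helper name made up for illustration, not something taken from the series. The thread never waits on a lock; if another thread changes the value first, it just retries with the fresh value.

```cpp
#include <atomic>

// Hypothetical helper (illustration only): atomically raise `target` to at
// least `candidate` using a compare-exchange retry loop.
// Returns true if this call raised the stored value, false if the stored
// value was already >= candidate.
bool atomic_store_max(std::atomic<int>& target, int candidate) {
    int observed = target.load(std::memory_order_relaxed);
    while (observed < candidate) {
        // If another thread changed `target` since we read it, the exchange
        // fails and `observed` is refreshed with the current value, so the
        // loop condition is re-checked against up-to-date data.
        if (target.compare_exchange_weak(observed, candidate,
                                         std::memory_order_release,
                                         std::memory_order_relaxed)) {
            return true;
        }
    }
    return false;
}
```

Note that a failed exchange costs a retry, not a sleep: the caller spins through the loop again, and a failure caused by a competing update means some other thread made progress. That is what makes this kind of code usable in places where blocking is forbidden.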