The example is unclear for two reasons:
- It is too simple to fully show what's happening with the fences.
- Albahari is including requirements for non-x86 architectures. See MSDN: "MemoryBarrier is required only on multiprocessor systems with weak memory ordering (for example, a system employing multiple Intel Itanium processors [which Microsoft no longer supports]).".
If you consider the following, it becomes clearer:
- A memory barrier (full barriers here - .Net doesn't provide a half barrier) prevents read / write instructions from jumping the fence (due to various optimisations). This guarantees us the code after the fence will execute after the code before the fence.
- "This serializing operation guarantees that every load and store instruction that precedes in program order the MFENCE instruction is globally visible before any load or store instruction that follows the MFENCE instruction is globally visible." See here.
- x86 CPUs have a strong memory model and guarantee writes appear consistent to all threads / cores (therefore barriers #2 & #3 are unneeded on x86). But, we are not guaranteed that reads and writes will remain in coded sequence, hence the need for barriers #1 and #4.
- Memory barriers are inefficient and needn't be used (see the same MSDN article). I personally use Interlocked and volatile (make sure you know how to use it correctly!!), which work efficiently and are easy to understand.
Ps. This article explains the inner workings of x86 nicely.