The C++11 standard defines a memory model (1.7, 1.10) which contains memory orderings, which are, roughly, "sequentially-consistent", "acquire", "consume", "rele
Jeff Preshing has a great blog post answering this question. I can't add anything myself, but I think anyone wondering about consume vs. acquire should read it:
http://preshing.com/20140709/the-purpose-of-memory_order_consume-in-cpp11/
He shows a specific C++ example with the corresponding assembly code, benchmarked across three different architectures. Compared to memory_order_acquire, memory_order_consume potentially offers a 3x speedup on PowerPC, a 1.6x speedup on ARM, and a negligible speedup on x86, which has a strong memory model anyway. The catch is that, at the time he wrote it, only GCC actually treated consume semantics any differently from acquire, and probably only because of a bug. Nonetheless, it demonstrates that a speedup is available if compiler writers can figure out how to take advantage of it.
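For readers who want to see the pattern in question, here is a minimal sketch of the pointer-publishing idiom where consume and acquire differ. It is not Preshing's benchmark code; `Payload`, `g_published`, `publish`, `read_published` and `run_once` are hypothetical names made up for this illustration. The reader only touches data reached through the loaded pointer, which is the one case where consume is specified to be sufficient. As far as I know, current mainstream compilers simply treat consume as acquire, so don't expect different machine code from the two calls.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical payload type, used only for this illustration.
struct Payload {
    int value;
};

// Hypothetical shared "guard" pointer: null until the payload is ready.
std::atomic<Payload*> g_published{nullptr};

// Writer: fill in the payload, then publish the pointer with release semantics.
void publish(Payload* p, int v) {
    p->value = v;
    g_published.store(p, std::memory_order_release);
}

// Reader: spin until the pointer is published, then read through it.
int read_published(std::memory_order order) {
    // Only these two orderings make sense for the load in this sketch.
    assert(order == std::memory_order_consume ||
           order == std::memory_order_acquire);
    Payload* p;
    while ((p = g_published.load(order)) == nullptr) {
        std::this_thread::yield();
    }
    // p->value carries a data dependency on the loaded pointer, which is
    // the only kind of later access that consume is specified to order.
    // With acquire, *all* later reads and writes are ordered after the load.
    return p->value;
}

// Runs one writer and one reader and returns what the reader saw.
int run_once(std::memory_order order) {
    Payload payload{0};
    g_published.store(nullptr, std::memory_order_relaxed);  // reset before threads start
    int seen = 0;
    std::thread reader([&] { seen = read_published(order); });
    std::thread writer([&] { publish(&payload, 42); });
    writer.join();
    reader.join();  // join makes the write to `seen` visible here
    return seen;
}
```

Calling `run_once(std::memory_order_consume)` or `run_once(std::memory_order_acquire)` from `main` should behave the same; the difference, where a compiler implements it, is only in the barrier instructions emitted on weakly ordered CPUs such as PowerPC and ARM. You will probably need to link with thread support (e.g. `-pthread` on GCC/Clang).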