Implementing a high-performance mutex similar to Qt's
Question: I have a multithreaded scientific application in which several computing threads (one per core) have to store their results in a common buffer. This requires a mutex mechanism. Worker threads spend only a small fraction of their time writing to the buffer, so the mutex is unlocked most of the time, and a lock attempt is very likely to succeed immediately, without waiting for another thread to unlock. Currently, I use Qt's QMutex for the task, and it works well: the mutex has a negligible