I\'m writing a program with a consumer thread and a producer thread, now it seems queue synchronization is a big overhead in the program, and I looked for some lock free que
The algorithm you show manages to work because although the two threads share the resource (i.e., the queue), they share it in a very particular way. Because only one thread ever alters the head-index of the queue (the producer), and only one thread every alters the tail-index (consumer, of course), you can't get an inconsistent state of the shared object. It's also important that the producer put the actual data in before updating the head index, and that the consumer reads the data it wants before updating the tail index.
It works as well as it does b/c the array is quite static; both threads can count on the storage for the elements being there. You probably can't replace the array entirely, but what you can do is change what the array is used for.
I.e., instead of keeping the data in the array, use it to keep pointers to the data. Then you can malloc() and free() the data items, while passing references (pointers) to them between your threads via the array.
Also, posix does support reading a nanosecond clock, although the actual precision is system dependent. You can read this high resolution clock before and after and just subtract.