Question
Here is a program foo.c that writes data to shared memory.
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <string.h>
#include <stdint.h>
#include <unistd.h>
#include <sys/ipc.h>
#include <sys/shm.h>
int main()
{
    key_t key;
    int shmid;
    char *mem;
    if ((key = ftok("ftok", 0)) == -1) {
        perror("ftok");
        return 1;
    }
    if ((shmid = shmget(key, 100, 0600 | IPC_CREAT)) == -1) {
        perror("shmget");
        return 1;
    }
    printf("key: 0x%x; shmid: %d\n", key, shmid);
    if ((mem = shmat(shmid, NULL, 0)) == (void *) -1) {
        perror("shmat");
        return 1;
    }
    sprintf(mem, "hello");
    sleep(10);
    sprintf(mem, "exit");
    return 0;
}
Here is another program bar.c that reads data from the same shared memory.
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <string.h>
#include <stdint.h>
#include <unistd.h>
#include <sys/ipc.h>
#include <sys/shm.h>
int main()
{
    key_t key;
    int shmid;
    volatile char *mem;
    if ((key = ftok("ftok", 0)) == -1) {
        perror("ftok");
        return 1;
    }
    if ((shmid = shmget(key, sizeof (int), 0400 | IPC_CREAT)) == -1) {
        perror("shmget");
        return 1;
    }
    printf("key: 0x%x; shmid: %d\n", key, shmid);
    if ((mem = shmat(shmid, NULL, 0)) == (void *) -1) {
        perror("shmat");
        return 1;
    }
    printf("looping ...\n");
    while (strncmp((char *) mem, "exit", 4) != 0)
        ;
    printf("exiting ...\n");
    return 0;
}
I run the writer program first in one terminal.
touch ftok && gcc foo.c -o foo && ./foo
While the writer program is still running, I run the reader program in another terminal.
gcc -O1 bar.c -o bar && ./bar
The reader program goes into an infinite loop. It looks like the optimizer has optimized the following code
    while (strncmp((char *) mem, "exit", 4) != 0)
        ;
to
    while (1)
        ;
because it sees nothing in the loop that could modify the data at mem after it has been read once.
But I declared mem as volatile precisely for this reason: to prevent the compiler from optimizing the reads away.
volatile char *mem;
Why does the compiler still optimize away the reads for mem?
By the way, I have found a workaround. It is to change
    while (strncmp((char *) mem, "exit", 4) != 0)
        ;
to
    while (mem[0] != 'e' || mem[1] != 'x' || mem[2] != 'i' || mem[3] != 't')
        ;
Why is it that the compiler optimizes away strncmp((char *) mem, "exit", 4) != 0 but does not optimize away mem[0] != 'e' || mem[1] != 'x' || mem[2] != 'i' || mem[3] != 't' even though char *mem is declared to be volatile in both cases?
Answer 1:
By writing (char *)mem you are telling the strncmp function that it is actually not a volatile buffer. And indeed, strncmp and the other C library functions are not designed to work on volatile buffers.
You do in fact need to modify your code to not use C library functions on volatile buffers. Your options include:
- Write your own alternative to the C library function that works with volatile buffers.
- Use a proper memory barrier.
You've gone with the first option, but think about what would happen if the other process modified the memory in between your four reads. To avoid this sort of problem you'd need the second option, an inter-process memory barrier, in which case the buffer no longer needs to be volatile and you can go back to using the C library functions. (The compiler must assume that the barrier check might change the buffer.)
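As a rough illustration of the barrier idea, here is a minimal sketch. It assumes GCC or Clang, where a C11 atomic_thread_fence also acts as a compiler barrier, so memory reachable through the pointer has to be reloaded after it. The names saw_exit and wait_for_exit are hypothetical, and a real inter-process design would more typically use a semaphore or mutex rather than busy-waiting on a fence.

```c
#include <stdatomic.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the buffer currently starts with "exit".
 * Note that mem is deliberately NOT volatile here. */
int saw_exit(const char *mem)
{
    /* Assumption: on GCC/Clang this fence is also a compiler barrier, so
     * bytes loaded from mem before this point cannot be reused after it. */
    atomic_thread_fence(memory_order_seq_cst);
    return strncmp(mem, "exit", 4) == 0;
}

/* Hypothetical polling loop built on the helper above. */
void wait_for_exit(const char *mem)
{
    while (!saw_exit(mem))
        ;
}
```

In bar.c, mem would then be declared as a plain char * and the loop replaced with wait_for_exit(mem). This still does not make the four-byte comparison atomic with respect to the writer; it only stops the compiler from caching the first read.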
Answer 2:
6.7.3 Type qualifiers
6 [...] If an attempt is made to refer to an object defined with a volatile-qualified type through use of an lvalue with non-volatile-qualified type, the behavior is undefined.133)
133) This applies to those objects that behave as if they were defined with qualified types, even if they are never actually defined as objects in the program (such as an object at a memory-mapped input/output address).
That is exactly what you observe in your code. The compiler is basically optimizing your code under the wild freedom of "the behavior is undefined anyway".
In other words, it is impossible to correctly apply strncmp directly to volatile data.
What you can do is either implement your own comparison that does not discard the volatile qualifier (which is what you've done already), or use some volatile-aware method of copying the volatile data to non-volatile storage and then apply strncmp to the latter.
Source: https://stackoverflow.com/questions/41051724/why-does-the-compiler-optimize-away-shared-memory-reads-due-to-strncmp-even-if
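The second suggestion could look roughly like the sketch below. The helper names copy_from_volatile and volatile_starts_with_exit are hypothetical, not part of any library. The point is that every read of the shared buffer goes through a volatile-qualified lvalue, and strncmp only ever sees an ordinary local array.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy n bytes out of a volatile buffer one byte at a
 * time. Each read uses a volatile-qualified lvalue, so the compiler must
 * actually perform it. */
void copy_from_volatile(char *dst, const volatile char *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}

/* Hypothetical helper: returns 1 if the volatile buffer currently starts
 * with "exit", 0 otherwise. */
int volatile_starts_with_exit(const volatile char *mem)
{
    char snapshot[4];

    copy_from_volatile(snapshot, mem, sizeof snapshot);
    /* strncmp now operates on non-volatile storage only. */
    return strncmp(snapshot, "exit", sizeof snapshot) == 0;
}
```

With this in place, the loop in bar.c could be written as while (!volatile_starts_with_exit(mem)) ; without casting away volatile. The snapshot is still not atomic, so the writer may change the buffer partway through the copy.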