Is a function call a memory barrier?

情深已故 2020-12-24 13:52

Consider this C code:

extern volatile int hardware_reg;

void f(const void *src, size_t len)
{
    void *dst = <something>;

    hardware_reg = 1;

    memcpy(dst, src, len);

    hardware_reg = 0;
}

5 Answers
  • 2020-12-24 14:18

    The compiler cannot move the memcpy() operation before the hardware_reg = 1 write or after the hardware_reg = 0 write. That is what volatile ensures, at least as far as the instruction stream the compiler emits is concerned. A function call is not necessarily a 'memory barrier', but it does introduce a sequence point.

    The C99 standard says this about volatile (5.1.2.3/5 "Program execution"):

    At sequence points, volatile objects are stable in the sense that previous accesses are complete and subsequent accesses have not yet occurred.

    So at the sequence point represented by the memcpy() call, the volatile access that writes 1 has to have occurred, and the volatile access that writes 0 cannot have occurred yet.

    However, there are 2 things I'd like to point out:

    1. Depending on what <something> is, if nothing else is done with the destination buffer, the compiler might be able to remove the memcpy() operation completely. This is the reason Microsoft came up with the SecureZeroMemory() function. SecureZeroMemory() operates on volatile-qualified pointers to prevent the writes from being optimized away.

    2. volatile doesn't necessarily imply a memory barrier (which is a hardware matter, not just a question of code ordering), so if you're running on a multiprocessor machine or on certain types of hardware you may need to invoke a memory barrier explicitly (perhaps wmb() on Linux).

      Starting with MSVC 8 (VS 2005), Microsoft documents that the volatile keyword implies the appropriate memory barrier, so a separate specific memory barrier call may not be necessary:

      • http://msdn.microsoft.com/en-us/library/12a04hfd.aspx

      Also, when optimizing, the compiler must maintain ordering among references to volatile objects as well as references to other global objects. In particular,

      • A write to a volatile object (volatile write) has Release semantics; a reference to a global or static object that occurs before a write to a volatile object in the instruction sequence will occur before that volatile write in the compiled binary.

      • A read of a volatile object (volatile read) has Acquire semantics; a reference to a global or static object that occurs after a read of volatile memory in the instruction sequence will occur after that volatile read in the compiled binary.
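The two caveats above can be sketched in portable C11. This is only an illustration under stated assumptions: `volatile_copy` and `f_fenced` are hypothetical names, `hardware_reg` is defined locally as a stand-in for the real device register, and `atomic_thread_fence` is used as a generic fence. C11 fences are specified in terms of atomic operations, so for real device memory a platform-specific barrier (such as wmb() in the Linux kernel) may still be what you need.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Assumption: stand-in definition so the sketch links on its own.
   In the question this is an extern register provided by the platform. */
volatile int hardware_reg;

/* Hypothetical helper: copy through volatile-qualified pointers so every
   load and store must be performed, even if dst is never read again. */
static void volatile_copy(volatile void *dst, const volatile void *src, size_t len)
{
    volatile unsigned char *d = dst;
    const volatile unsigned char *s = src;

    while (len--)
        *d++ = *s++;
}

/* Hypothetical variant of f() from the question, with explicit fences
   around the copy. */
void f_fenced(void *dst, const void *src, size_t len)
{
    hardware_reg = 1;
    atomic_thread_fence(memory_order_seq_cst);  /* fence after the first register write */

    volatile_copy(dst, src, len);

    atomic_thread_fence(memory_order_seq_cst);  /* fence before the second register write */
    hardware_reg = 0;
}
```

Because the copy itself consists of volatile accesses, it is ordered against the two register writes by the same rule quoted from the standard; the fences are there for the hardware side of caveat 2.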

  • 2020-12-24 14:28

    It's probably going to get optimized, either because the compiler inlines the memcpy call and eliminates the first assignment, or because it gets compiled to RISC code or machine code and gets optimized there.

  • 2020-12-24 14:34

    My assumption would be that the compiler never re-orders volatile assignments since it has to assume they must be executed at exactly the position where they occur in the code.

  • 2020-12-24 14:35

    As far as I can see your reasoning leading to

    the compiler would see no trouble in moving the memcpy call

    is correct. Your question is not answered by the language definition, and can only be addressed with reference to specific compilers.

    Sorry to not have any more-useful information.

  • 2020-12-24 14:35

    Here is a slightly modified example, compiled with gcc 7.2.1 on x86-64:

    #include <string.h>
    static int temp;
    extern volatile int hardware_reg;
    int foo (int x)
    {
        hardware_reg = 0;
        memcpy(&temp, &x, sizeof(int));
        hardware_reg = 1;
        return temp;
    }
    

    gcc knows that the memcpy() is the same as an assignment, and knows that temp is not accessed anywhere else, so temp and the memcpy() disappear completely from the generated code:

    foo:
        movl    $0, hardware_reg(%rip)
        movl    %edi, %eax
        movl    $1, hardware_reg(%rip)
        ret
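
For contrast, here is a minimal sketch of the same function with `temp` itself made volatile. The name `foo_volatile` is hypothetical, and `hardware_reg` is defined locally only so the example links on its own. The copy is written as a plain assignment, because passing a volatile object to memcpy() would discard the qualifier. Since all three stores are now volatile accesses, the compiler has to keep each of them and keep them in source order; no particular assembly output is claimed here.

```c
/* Assumption: stand-in definition; the original declares this extern. */
volatile int hardware_reg;

/* Making temp volatile turns the store and the later load into
   observable accesses that cannot be removed or reordered relative
   to the two register writes. */
static volatile int temp;

int foo_volatile(int x)
{
    hardware_reg = 0;
    temp = x;          /* volatile store, must stay between the two writes */
    hardware_reg = 1;
    return temp;       /* volatile load, must actually read temp */
}
```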
    