Why does __sync_add_and_fetch work for a 64 bit variable on a 32 bit system?

深忆病人 2020-12-28 19:10

Consider the following condensed code:

/* Compile: gcc -pthread -m32 -ansi x.c */
#include 
#include <         


        
2 answers
  •  孤独总比滥情好
    2020-12-28 19:39

    The two 32-bit loads of the variable at 0x804855a and 0x804855f do not need to be atomic as a pair. Using the compare-and-swap instruction to increment looks like this in pseudocode:

    oldValue = *dest; // non-atomic: tearing between the halves is unlikely but possible
    do {
        newValue = oldValue+1;
    } while (!compare_and_swap(dest, &oldValue, newValue));
    

    Since the compare-and-swap checks that *dest == oldValue before swapping, it will act as a safeguard - so that if the value in oldValue is incorrect, the loop will be tried again, so there's no problem if the non-atomic read resulted in an incorrect value.
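To make that safeguard concrete, here is a minimal sketch using GCC's `__atomic_compare_exchange_n` builtin. The helper name `cas_increment` and its `first_guess` and `attempts` parameters are made up for illustration; passing in the starting guess simply lets you feed the loop a deliberately wrong value, standing in for a stale or torn read.

```c
#include <stdint.h>

/* Hypothetical helper: increments *dest with a CAS retry loop.
   first_guess stands in for the non-atomic initial read and may be wrong.
   *attempts receives the number of CAS tries that were needed. */
uint64_t cas_increment(uint64_t *dest, uint64_t first_guess, int *attempts)
{
    uint64_t old_value = first_guess;
    uint64_t new_value;

    *attempts = 0;
    do {
        new_value = old_value + 1;
        ++*attempts;
        /* Strong CAS: stores new_value only if *dest == old_value.
           On failure the builtin writes the current *dest into old_value. */
    } while (!__atomic_compare_exchange_n(dest, &old_value, new_value, 0,
                                          __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST));
    return new_value;
}
```

With the variable holding 41 and a first guess of 0, the first CAS fails and corrects the guess, and the second one succeeds, so the result is still 42.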

    The 64-bit access to *dest done by lock cmpxchg8b is atomic (as part of an atomic RMW of *dest). Any tearing from loading the two halves separately will be caught here, as will a write from another core that lands after the initial read but before lock cmpxchg8b. The latter can happen even in cmpxchg retry loops on a single-register-width value (e.g. to implement an atomic fetch_mul, an atomic float, or other RMW operations that x86's lock prefix doesn't let us do directly).


    Your second question was why the line oldValue = *dest is not inside the loop. This is because the compare_and_swap function will always replace the value of oldValue with the actual value of *dest. So it will essentially perform the line oldValue = *dest for you, and there's no point in doing it again. In the case of the cmpxchg8b instruction, it will put the contents of the memory operand in edx:eax when the comparison fails.
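The same idea can be sketched with the older `__sync_val_compare_and_swap` builtin, whose return value plays the role of edx:eax: it is the value that was actually found in memory. The function name `sync_increment` is hypothetical, and the plain initial read mirrors the pseudocode above rather than portable C; strictly portable code would probably use an atomic load there.

```c
#include <stdint.h>

/* Hypothetical helper: CAS-loop increment that reads *dest only once. */
uint64_t sync_increment(uint64_t *dest)
{
    /* Plain read, outside the loop. Assumption: a stale or torn value
       here only costs one extra iteration, as described above. */
    uint64_t expected = *dest;

    for (;;) {
        /* Returns the value that was in *dest before the operation. */
        uint64_t seen = __sync_val_compare_and_swap(dest, expected,
                                                    expected + 1);
        if (seen == expected)
            return expected + 1;   /* the swap happened */
        expected = seen;           /* reuse what the CAS reported */
    }
}
```

Note that nothing inside the loop re-reads *dest directly; the failed compare-and-swap already supplies the fresh value.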

    The pseudocode for compare_and_swap is:

    bool compare_and_swap (int *dest, int *oldVal, int newVal)
    {
      do atomically {
        if ( *oldVal == *dest ) {
            *dest = newVal;
            return true;
        } else {
            *oldVal = *dest;
            return false;
        }
      }
    }
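For a runnable counterpart to that pseudocode, GCC's `__atomic_compare_exchange_n` has, as far as I can tell, the same contract. This is only a sketch: the wrapper keeps the pseudocode's name but takes `uint64_t` instead of `int` to match the 64-bit case in the question, and the sequentially consistent memory order is my assumption.

```c
#include <stdint.h>

/* Sketch of the pseudocode above on top of a GCC builtin.
   Returns nonzero if *dest was equal to *old_val and has been replaced
   by new_val; otherwise copies the current *dest into *old_val. */
int compare_and_swap(uint64_t *dest, uint64_t *old_val, uint64_t new_val)
{
    return __atomic_compare_exchange_n(dest, old_val, new_val,
                                       0 /* strong: no spurious failure */,
                                       __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
}
```

A failed call leaves *dest untouched and updates the caller's copy, so the next call with the same arguments can succeed.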
    

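    By the way, in your code you need to ensure that v is aligned to 8 bytes (64 bits). In 32-bit code a 64-bit integer is not always guaranteed more than 4-byte alignment (for example as a struct member), so it could be split between two cache lines. As far as I know, a lock cmpxchg8b on such an operand is still atomic on x86, but it becomes a very slow split lock, and GCC's atomic builtins assume a naturally aligned object. You can use GCC's __attribute__((aligned(8))) for this.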
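A minimal sketch of the attribute in use, assuming GCC or Clang. The `struct counters` type and the `is_aligned8` helper are invented for illustration and are not from the question's code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical struct: without the attribute, the i386 ABI may place v
   at a 4-byte boundary right after tag. */
struct counters {
    char tag;
    uint64_t v __attribute__((aligned(8)));   /* force 8-byte alignment */
};

/* Returns nonzero if p points to an 8-byte-aligned address. */
int is_aligned8(const void *p)
{
    return ((uintptr_t)p % 8) == 0;
}
```

An 8-byte object aligned to 8 bytes can never straddle a 64-byte cache line, so checking `is_aligned8(&c.v)` in a debug build is a cheap way to catch a misplaced counter.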
