Most efficient standard-compliant way of reinterpreting int as float

不思量自难忘° 2020-12-13 18:20

Assume I have guarantees that float is IEEE 754 binary32. Given a bit pattern that corresponds to a valid float, stored in std::uint32_t, how does

4 Answers
  •  误落风尘
    2020-12-13 18:56

    AFAIK, only two approaches comply with the strict aliasing rules: memcpy() and copying byte by byte through a char* cast. All others read a float from memory that belongs to a uint32_t, and the compiler is allowed to perform that read before the write to that memory location. It might even optimize away the write altogether, since under the strict aliasing rules it can prove that the stored value is never used, which results in a garbage return value.
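
    To make the distinction concrete, here is a minimal sketch contrasting the two kinds of access. The function names are hypothetical and chosen only for illustration; the cast version is shown purely for contrast and is never meant to be called.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Non-compliant (for contrast only): reads a uint32_t object through
   a float lvalue, which violates the strict aliasing rules. The
   compiler may reorder or drop accesses, so the result is undefined. */
float bits_to_float_cast(uint32_t a) {
    return *(float *)&a;
}

/* Compliant: copies the object representation with memcpy(), which is
   defined to access the bytes of any object. */
float bits_to_float_memcpy(uint32_t a) {
    float b;
    memcpy(&b, &a, sizeof b);
    return b;
}
```

    Assuming float is IEEE 754 binary32 as stated in the question, bits_to_float_memcpy(0x3F800000u) yields 1.0f.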

    Whether memcpy() or the char* copy is faster really depends on the compiler and its optimizer. In both cases, an intelligent compiler might be able to figure out that it can just load and copy a uint32_t, but I would not trust any compiler to do so before I have seen it in the generated assembly.

    Edit:
    After some testing with gcc 4.8.1, I can say that the memcpy() approach is the best for this particular compiler; see below for details.


    Compiling

    #include <stdint.h>
    
    float foo(uint32_t a) {
        float b;
        /* Access both objects through char*, which may alias any type. */
        char* aPointer = (char*)&a, *bPointer = (char*)&b;
        for( int i = sizeof(a); i--; ) bPointer[i] = aPointer[i];
        return b;
    }
    

    with gcc -S -std=gnu11 -O3 foo.c yields this assembly code:

    movl    %edi, %ecx
    movl    %edi, %edx
    movl    %edi, %eax
    shrl    $24, %ecx
    shrl    $16, %edx
    shrw    $8, %ax
    movb    %cl, -1(%rsp)
    movb    %dl, -2(%rsp)
    movb    %al, -3(%rsp)
    movb    %dil, -4(%rsp)
    movss   -4(%rsp), %xmm0
    ret
    

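    This is not optimal: the value is split up with three shifts and written back as four single-byte stores before it is reloaded as a float.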

    Doing the same with

    #include <stdint.h>
    #include <string.h>
    
    float foo(uint32_t a) {
        float b;
        char* aPointer = (char*)&a, *bPointer = (char*)&b;
        memcpy(bPointer, aPointer, sizeof(a));
        return b;
    }
    

    yields (with all optimization levels except -O0):

    movl    %edi, -4(%rsp)
    movss   -4(%rsp), %xmm0
    ret
    

    This is optimal.
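
    For completeness, a sketch of how the memcpy() variant might be wrapped for real use. It assumes C11 (for static_assert) and adds a reverse helper; both function names are hypothetical and not part of the test above. The static_assert only checks the size; that float is actually IEEE 754 binary32 remains the assumption from the question.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Fail at compile time if the two types differ in size, since the
   memcpy() below copies exactly sizeof(float) bytes. */
static_assert(sizeof(float) == sizeof(uint32_t),
              "float must be 32 bits wide");

/* Reinterpret a 32-bit pattern as a float. */
float float_from_bits(uint32_t bits) {
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Reverse direction: obtain the bit pattern of a float. */
uint32_t float_to_bits(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

    Under the binary32 assumption, float_from_bits(0x41200000u) is 10.0f, and passing that result to float_to_bits() gives back 0x41200000u.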
