Once again: strict aliasing rule and char*

情深已故 2020-12-28 18:41

The more I read, the more confused I get.

The last of the related questions is the closest to mine, but I got confused by all the discussion of object lifetime.

2 Answers
  • 2020-12-28 19:18

    How is the second one different from the first one, especially when we're talking about reordering instructions (for optimization)?

    The problem is in the compiler using the rules to determine whether such an optimization is allowed. In the second case you're trying to read a char[] object via an incompatible pointer type, which is undefined behavior; hence, the compiler might re-order the read and write (or do anything else which you might not expect).

    As unnatural as it might seem, you really have to stop thinking about how you think the compiler might optimize, and just obey the rules.

    Or this is just a straight rule, which clearly states: "this can be done in the one direction, but not in the other"? I couldn't find anything relevant in the standards (searched for this especially in C++11 standard).

    http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3337.pdf chapter 3.10 paragraph 10.

    In C99 and C11, it's 6.5 paragraph 7.

    Both C and C++ allow accessing any object type via char * (or specifically, an lvalue of type char). They do not allow accessing a char object via an arbitrary type. So yes, the rule is a "one way" rule.
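
A minimal sketch of the allowed direction, assuming nothing beyond standard C. The helper names `byte_sum` and `sum_u32_bytes` are made up for illustration; the forbidden direction appears only in a comment because executing it would be undefined behavior.

```c
#include <stddef.h>
#include <stdint.h>

/* Allowed: inspect the bytes of any object through an unsigned char pointer. */
static unsigned byte_sum(const void *obj, size_t size)
{
    const unsigned char *p = obj;
    unsigned sum = 0;
    for (size_t i = 0; i < size; i++)
        sum += p[i];
    return sum;
}

/* Hypothetical helper: adds up the four bytes of a uint32_t.
 * The sum does not depend on the host byte order. */
static unsigned sum_u32_bytes(uint32_t v)
{
    return byte_sum(&v, sizeof v);
}

/* NOT allowed (the other direction), shown only as a comment:
 *
 *     char buf[4] = {1, 0, 0, 0};
 *     uint32_t v = *(uint32_t *)buf;   reads a char array as uint32_t
 */
```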

    I used union to "workaround" this, which still appears to be NOT 100% OK, as it's not guaranteed by the standard (which states, that I can only rely on the value, which is last modified in the union).

    Although the wording of the standard is horribly ambiguous, in C99 (and beyond) it's clear (at least since C99 TC3) that the intent is to allow type-punning through a union. You must however perform all accesses through the union (in particular you cannot just 'cast a union into existence' for purpose of type-punning).
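
As a rough sketch of what "all accesses through the union" means in C: the union type `word` and the function `bytes_to_u32_native` below are hypothetical names. The result is in host byte order, so it is not portable across architectures, and this reading applies to C; C++ does not give the same guarantee.

```c
#include <stdint.h>

/* Hypothetical union used only for this illustration. */
union word {
    uint32_t      value;
    unsigned char bytes[4];
};

static uint32_t bytes_to_u32_native(const unsigned char in[4])
{
    union word w;                 /* a real union object, not a cast pointer */
    for (int i = 0; i < 4; i++)
        w.bytes[i] = in[i];       /* write through one member ... */
    return w.value;               /* ... read through the other (host byte order) */
}
```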

    the returned value is in char[ 4 ]. Then I need to convert this to uint32_t

    Just use memcpy or manually shift the bytes to the correct position, in case byte-ordering is an issue. Good compilers can optimize this out anyway (yes, even the call to memcpy).
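
The memcpy variant might look like this (a sketch; `load_u32_native` is an illustrative name, and the value comes out in host byte order, so use shifts instead when the byte order of the data is fixed):

```c
#include <stdint.h>
#include <string.h>

/* Copies four bytes into a uint32_t without violating the aliasing rule. */
static uint32_t load_u32_native(const char buf[4])
{
    uint32_t v;
    memcpy(&v, buf, sizeof v);    /* byte-wise copy, no incompatible pointer access */
    return v;
}
```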

  • 2020-12-28 19:18

    I used union to "workaround" this, which still appears to be NOT 100% OK, as it's not guaranteed by the standard (which states, that I can only rely on the value, which is last modified in the union).

    Endianness is the reason for this. Specifically, the sequence of bytes 01 00 00 00 could mean 1 (little-endian) or 16,777,216 (big-endian).
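
To make the two readings concrete, here is a small sketch that decodes the same four bytes both ways. The function names `decode_le` and `decode_be` are made up for this example.

```c
#include <stdint.h>

/* Treats b[0] as the least significant byte. */
static uint32_t decode_le(const unsigned char b[4])
{
    return (uint32_t)b[0]         | ((uint32_t)b[1] << 8) |
           ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* Treats b[0] as the most significant byte. */
static uint32_t decode_be(const unsigned char b[4])
{
    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8)  | (uint32_t)b[3];
}
```

With the bytes 01 00 00 00, `decode_le` gives 1 and `decode_be` gives 16,777,216, regardless of the machine the code runs on.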

    The correct way to do what you are doing is to stop trying to trick the compiler into doing the conversion for you and perform the conversion yourself.

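    For instance, if the char[4] is little-endian (least significant byte first), then you would do something like the following.

    #include <stdint.h>

    char buff[4];  /* filled elsewhere, least significant byte first */
    uint32_t result = 0;
    for (int i = 0; i < 4; i++)  /* cast via unsigned char to avoid sign extension */
        result |= (uint32_t)(unsigned char)buff[i] << (8 * i);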
    

    This performs the conversion arithmetically, so the result depends only on the byte order of the data and not on the byte order of the machine running the code.

    Now if you were doing this conversion in a hot path, it might make sense to use #if and knowledge of your architecture to let a union do this automatically, as you mentioned, but that is again getting away from portable solutions. (You can also keep the manual conversion as your fallback if you can't be certain of the architecture.)
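
One possible shape for the #if idea, as a sketch only. It assumes the `__BYTE_ORDER__` and `__ORDER_LITTLE_ENDIAN__` macros that GCC and Clang predefine (other compilers may not have them), and it uses memcpy for the fast path rather than a union. `load_le32` is a hypothetical name.

```c
#include <stdint.h>
#include <string.h>

/* Decodes four little-endian bytes into a uint32_t. */
static uint32_t load_le32(const unsigned char b[4])
{
#if defined(__BYTE_ORDER__) && defined(__ORDER_LITTLE_ENDIAN__) && \
    __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
    /* Host is known to be little-endian: a plain copy is enough. */
    uint32_t v;
    memcpy(&v, b, sizeof v);
    return v;
#else
    /* Unknown or big-endian host: portable fallback using shifts. */
    return (uint32_t)b[0]         | ((uint32_t)b[1] << 8) |
           ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
#endif
}
```

Both branches return the same value for the same input, so callers never need to know which one was compiled in.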
