Aliasing through unions

Posted by 偶尔善良 on 2021-01-27 14:16:51

Question


Section 6.5(p7) of the C standard has a bullet about unions and aggregates:

An object shall have its stored value accessed only by an lvalue expression that has one of the following types:

[...]

— an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or

It is not quite clear what this means. Does it require at least one member, or all members, to satisfy the strict aliasing rule? In particular, consider unions:

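#include <inttypes.h>   /* PRIu64 */
#include <stdint.h>     /* uint64_t */
#include <stdio.h>      /* printf */
#include <string.h>     /* memcpy */

union aliased{
    unsigned char uint64_repr[sizeof(uint64_t)];
    uint64_t value;
};

int main(void){
    uint64_t some_random_value = 123;
    union aliased alias;
    /* copy the bytes into the char-array member, then read the other member */
    memcpy(&(alias.uint64_repr), &some_random_value, sizeof(uint64_t));
    printf("Value = %" PRIu64 "\n", alias.value);
    return 0;
}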


Is the behavior of the program well-defined? If no, what does the bullet mean?


Answer 1:


What it means is that using a union is one of the standard-compliant ways to avoid type-punning a pointer, and with it the strict aliasing violation that would otherwise occur if you attempted to access a stored value through a pointer of a different type.

Take unsigned and float, for example. Both are generally 32 bits wide, and in certain cases you may need to look at the stored value through either an unsigned* or a float*. You cannot, for example, do:

    float f = 3.3;
    // unsigned u = *(unsigned *)&f;  /* violation */

Following 6.5(p7) you can use a union between both types and access the same information as either unsigned or float without type-punning a pointer or running afoul of the strict aliasing rule, e.g.

typedef union {
    float f;
    unsigned u;
} f2u;
...    
    float f = 3.3;
    // unsigned u = *(unsigned *)&f;  /* violation */
    f2u fu = { .f = f };
    unsigned u = fu.u;                /* OK - no violation */

So the strict aliasing rule prevents accessing memory that has one effective type through a pointer to another type, unless that pointer is to a character type or points to a member of a union of the two types.

(Note: that section of the standard is anything but an example of clarity; you can read it ten times and still scratch your head. Its intent is to curb the abuse of pointer types while still recognizing that a block of memory in any form must be accessible through a character type, and a union is among the other allowable manners of access.)
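As a rough sketch of the access routes mentioned above, the following compares reading a float's bits through a union member, through memcpy, and through unsigned char lvalues. The helper names (bits_via_union, bits_via_memcpy, bits_via_bytes) are made up for illustration, and the code assumes that float and uint32_t have the same size, which the static_assert checks at compile time.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* size_t */
#include <stdint.h>   /* uint32_t */
#include <string.h>   /* memcpy */

/* Assumption: float is 32 bits wide on this platform. */
static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* Hypothetical helper: store one union member, read the other. */
uint32_t bits_via_union(float f)
{
    union { float f; uint32_t u; } pun = { .f = f };
    return pun.u;
}

/* Hypothetical helper: copy the object representation with memcpy. */
uint32_t bits_via_memcpy(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Hypothetical helper: copy byte by byte through unsigned char lvalues,
 * which are allowed to access the storage of any object. */
uint32_t bits_via_bytes(float f)
{
    uint32_t u;
    unsigned char *dst = (unsigned char *)&u;
    const unsigned char *src = (const unsigned char *)&f;
    for (size_t i = 0; i < sizeof u; ++i)
        dst[i] = src[i];
    return u;
}
```

All three helpers should return the same bit pattern for a given input, since each one reproduces the float's object representation without dereferencing a pointer cast to unsigned*.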

Compilers have gotten much better in the past few years at flagging violations of the rule.




Answer 2:


The bullet point serves two purposes. First, if one accepts that an access through an lvalue which is, or might be, visibly based upon an lvalue of a particular type should be treated as an access through an lvalue, or possible lvalue, of that latter type, then given something like:

union U {int x[10]; float y[10];} u;

an lvalue which is visibly derived from u would be allowed to access all of the objects contained therein. The range of situations in which an implementation would recognize that an lvalue is based upon another is a quality-of-implementation issue, with some quality compilers like icc being able to recognize, given something like:

int load_array_element(int *array, int i) { return array[i]; }
...
int test(int i) { return load_array_element(u.x, i); }

that anything that particular call to load_array_element might do with *array would be done to u (it is being given the address of an lvalue directly formed from u, after all), and other compilers like clang and gcc being unable to recognize even a construct like *(u.x+i) as an lvalue based on u.
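For reference, here is a compilable sketch of the union example under discussion. The wrapper store_then_load is a hypothetical addition for illustration, and the call passes u.x (which decays to int *) so that the argument type matches the parameter.

```c
#include <assert.h>

union U { int x[10]; float y[10]; } u;

/* Receives a pointer that the caller derived directly from u.x. */
int load_array_element(int *array, int i) { return array[i]; }

int test(int i) { return load_array_element(u.x, i); }

/* Hypothetical helper: store through the union member x, then read the
 * same element back through the pointer derived from u.x. */
int store_then_load(int i, int value)
{
    assert(i >= 0 && i < 10);   /* stay inside the array */
    u.x[i] = value;
    return test(i);
}
```

Because x is both the member last stored and the member read, this particular sequence does not depend on how a compiler treats pointers derived from other union members.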

A second purpose of the bullet is to suggest that even if a compiler is too primitive to keep track of lvalue derivation in straight-line code, it should recognize that given declarations:

int *p,i;
struct foo { int x;} foo;

if it sees *p=1; i=foo.x; without having paid any attention to where p came from, it must ensure that the write to *p is performed before the read of foo.x. Even if that should only really be necessary in cases where a compiler which had been bothering to pay attention would have been able to see that p had been formed from foo, describing things in those terms would have increased apparent compiler complexity compared with making the access to foo.x force the completion of any pending writes to the targets of integer pointers.
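A minimal sketch of that scenario follows. The function write_then_read is a hypothetical name; inside it the compiler has no information about where p came from, so the read of foo.x has to observe a preceding write through p whenever p happens to point at foo.x.

```c
#include <assert.h>

struct foo { int x; } foo;

/* Hypothetical helper: p may or may not point at foo.x, and nothing in
 * this function reveals which. */
int write_then_read(int *p)
{
    assert(p != 0);   /* precondition: p must be a valid pointer */
    *p = 1;           /* write through an int lvalue */
    int i = foo.x;    /* read through the struct lvalue foo */
    return i;
}
```

Calling write_then_read(&foo.x) should return 1, while calling it with the address of an unrelated int should return whatever foo.x held beforehand.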

Note that if one is interested only in cases where a struct or union member is accessed via a freshly derived pointer, there's no need to include a general permission to access the struct or union object via an lvalue of member type. Given the code sequence: foo.x = 1; p = &foo.x; i=*p;, the act of taking the address of foo.x should cause the compiler to complete any pending writes to foo.x before running any code that might use the address (a compiler that has no idea what downstream code would do with the address could simply complete the write immediately). If the code sequence were foo.x = 1; i = *p;, the act of accessing foo.x via the lvalue foo would mean that any existing pointer that might identify that storage would be "stale", and thus a compiler would be under no obligation to recognize that such a pointer might identify the same storage as foo.x.

Note that despite footnote 88, which clearly says that the purpose of the "strict aliasing rule" is to specify when objects are allowed to alias, gcc and clang interpret the rule as an excuse to ignore cases in which objects are accessed by lvalues which are quite visibly derived from them. Perhaps in retrospect the authors of the Standard should have included a provision such as "Note that this rule makes no attempt to forbid low-quality compilers from behaving in obtuse fashion, but is not intended to invite such behavior", but the authors of C89 had no reason to expect that the rule would be interpreted as it has been, and the authors of clang and gcc would almost certainly veto any suggestion to add such language now.



Source: https://stackoverflow.com/questions/55254998/aliasing-through-unions
