Do the C++11 strict aliasing rules allow accessing uint64_t via char *, char(&)[N], or even std::array<char, N>& with -fstrict-aliasing -Wstrict-aliasing=2?

Submitted by 人盡茶涼 on 2019-12-05 15:53:53
xskxzr

The char(&)[N] case and the std::array<char, N> case both result in undefined behavior. The reason is in the passage you have already block-quoted. Note that neither char(&)[N] nor std::array<char, N> is the same type as char.

I am not sure of the char case, because the current standard does not explicitly say that an object can be viewed as an array of narrow characters (see here for further discussion).

Anyway, if you want to access the underlying bytes of an object, use std::memcpy, as the standard explicitly says in [basic.types]/2:

For any object (other than a base-class subobject) of trivially copyable type T, whether or not the object holds a valid value of type T, the underlying bytes ([intro.memory]) making up the object can be copied into an array of char, unsigned char, or std::byte ([cstddef.syn]). If the content of that array is copied back into the object, the object shall subsequently hold its original value. [ Example:

#define N sizeof(T)
char buf[N];
T obj;                          // obj initialized to its original value
std::memcpy(buf, &obj, N);      // between these two calls to std::memcpy, obj might be modified
std::memcpy(&obj, buf, N);      // at this point, each subobject of obj of scalar type holds its original value

— end example ]
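Applied to the uint64_t from the question, the memcpy approach might look like the sketch below. The names U64Bytes, to_bytes and load_u64 are hypothetical helpers introduced here for illustration; they do not come from the question or from the standard.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical alias: a char array exactly as large as a uint64_t.
using U64Bytes = std::array<char, sizeof(std::uint64_t)>;

// Copy the underlying bytes of value into a char array.
// No uint64_t is ever accessed through a char-typed lvalue here.
inline U64Bytes to_bytes(std::uint64_t value) {
    U64Bytes bytes{};
    std::memcpy(bytes.data(), &value, sizeof value);
    return bytes;
}

// Rebuild a uint64_t from the first sizeof(uint64_t) bytes of buf.
inline std::uint64_t load_u64(const char* buf, std::size_t len) {
    assert(len >= sizeof(std::uint64_t));  // caller must supply enough bytes
    std::uint64_t value;
    std::memcpy(&value, buf, sizeof value);
    return value;
}
```

Calling load_u64 on the result of to_bytes gives back the original value. Mainstream compilers typically turn such fixed-size memcpy calls into plain loads and stores when optimizing, so this usually costs nothing over a cast.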

The strict aliasing rule is in fact very simple: two objects with overlapping lifetimes cannot have overlapping storage regions unless one is a subobject of the other. (*)

Nevertheless, reading the memory representation of an object is allowed. That representation is a sequence of unsigned char, per [basic.types]/4:

The object representation of an object of type T is the sequence of N unsigned char objects taken up by the object of type T, where N equals sizeof(T). The value representation of an object is the set of bits that hold the value of type T.

Accordingly in your example:

  • lam(str1) is UB (Undefined Behavior);
  • lam(str2) is UB (an array and its first element are not pointer interconvertible);
  • lam(str3) is not stated to be UB in the standard. If you replace char with unsigned char, one could argue that you are reading the object representation (this is not explicitly defined either, but it should work on all compilers).

So using the third case and changing the declaration of p to const unsigned char* should always produce the expected result. The other two cases may work in this simple example, but may break if the code becomes more complicated or with a newer compiler version.
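A minimal sketch of the third case with const unsigned char* follows. The function name byte_at is hypothetical, and as noted above the standard does not spell out that this is well defined, so treat it as the commonly relied-upon idiom rather than a guarantee.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: return byte i of the object representation of value.
// Which byte holds which part of the number depends on the platform's
// endianness, so only the position in memory is meaningful here.
inline unsigned char byte_at(const std::uint64_t& value, std::size_t i) {
    assert(i < sizeof value);  // stay inside the object
    const unsigned char* p = reinterpret_cast<const unsigned char*>(&value);
    return p[i];
}
```

The pointer is never cast to an array reference or to std::array, which are the two forms identified as UB above.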


(*) There are two exceptions to this rule: one for union members with a common initial sequence, and one for an array of unsigned char or std::byte that provides storage for another object.
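The second exception can be illustrated with placement new into a suitably aligned unsigned char array. This is only a sketch; the function name make_in_buffer is hypothetical, and the alignment assert is there purely to document the requirement.

```cpp
#include <cassert>
#include <cstdint>
#include <new>

// Hypothetical helper: create a uint64_t inside an unsigned char array
// and read it back through the pointer returned by placement new.
inline std::uint64_t make_in_buffer(std::uint64_t v) {
    // The array overlaps the uint64_t's storage, which is the permitted case.
    alignas(std::uint64_t) unsigned char storage[sizeof(std::uint64_t)];
    assert(reinterpret_cast<std::uintptr_t>(storage) % alignof(std::uint64_t) == 0);

    std::uint64_t* p = new (storage) std::uint64_t(v);  // starts the object's lifetime
    return *p;  // access via the pointer from new, not via a cast of storage
}
```

Note that the object is read through the pointer that placement new returns. Casting storage to std::uint64_t* without first creating an object there would run into the aliasing problem discussed above.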
