Is using a union in place of a cast well defined?

Posted by 余生长醉 on 2019-12-17 12:47:35

Question


I had a discussion this morning with a colleague regarding the correctness of a "coding trick" to detect endianness.

The trick was:

bool is_big_endian()
{
  union
  {
    int i;
    char c[sizeof(int)];
  } foo;


  foo.i = 1;
  return (foo.c[0] == 1);
}

To me, it seems that this usage of a union is incorrect because setting one member of the union and reading another is not well-defined. But I have to admit that this is just a feeling, and I lack actual proof to back up my point.

Is this trick correct? Who is right here?


Answer 1:


Your code is not portable. It might work on some compilers or it might not.

You are right that the behaviour is undefined when you access an inactive member of the union, as the given code does.

§9.5/1 (C++ standard):

In a union, at most one of the data members can be active at any time, that is, the value of at most one of the data members can be stored in a union at any time.

So foo.c[0] == 1 is incorrect because c is not active at that moment. Feel free to correct me if you think I am wrong.




Answer 2:


Don't do this; use something like the following instead:

#include <arpa/inet.h>
//#include <winsock2.h> // <-- for Windows use this instead

#include <stdint.h>

bool is_big_endian() {
  uint32_t i = 1;
  return i == htonl(i);
}

Explanation:

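The htonl function converts a 32-bit unsigned integer (uint32_t on POSIX, u_long on Windows) from host byte order to TCP/IP network byte order, which is big-endian. On a big-endian host the conversion leaves the value unchanged, so i == htonl(i) holds only there.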


References:

  • http://linux.die.net/man/3/htonl
  • http://msdn.microsoft.com/de-de/library/ms738556%28v=vs.85%29.aspx
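To make the "network byte order is big-endian" point concrete, here is a minimal sketch, assuming a POSIX system where htonl is declared in <arpa/inet.h>. The helper names first_byte_in_network_order and check_network_order are hypothetical and exist only for illustration. The sketch copies the converted value into a byte array with std::memcpy, so no union or pointer trick is involved.

```cpp
#include <arpa/inet.h>  // htonl (POSIX); on Windows use <winsock2.h> instead
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: returns the byte stored at the lowest address of a
// 32-bit value after it has been converted to network byte order.
unsigned char first_byte_in_network_order(std::uint32_t host_value)
{
    const std::uint32_t net = htonl(host_value);
    unsigned char bytes[sizeof net];
    std::memcpy(bytes, &net, sizeof net);  // copy out the object representation
    return bytes[0];
}

// Hypothetical self-check: in network byte order the most significant byte
// comes first, whatever the byte order of the host is.
void check_network_order()
{
    assert(first_byte_in_network_order(0x01020304u) == 0x01);
    assert(first_byte_in_network_order(1u) == 0x00);
}
```

Because the result is the same on every host, comparing a value with its htonl conversion only tells you whether the host order already matches the network order.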



Answer 3:


You're correct that the code doesn't have well-defined behavior. Here's how to do it portably:

#include <cstring>

bool is_big_endian()
{
    static unsigned const i = 1u;
    char c[sizeof(unsigned)] = { };
    // Copy the object representation of i into a char array.
    std::memcpy(c, &i, sizeof(c));
    // On a big-endian machine the lowest-addressed byte of 1 is zero.
    return !c[0];
}

// or, alternatively

bool is_big_endian()
{
    static unsigned const i = 1u;
    // Read the first byte of i through a char pointer.
    return !*static_cast<char const*>(static_cast<void const*>(&i));
}



Answer 4:


The function should be named is_little_endian, because it returns true when the least significant byte is stored first. I think you can use this union trick, or alternatively a cast to a char pointer.
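A minimal sketch of the renamed function, using a cast to a character pointer rather than the union. This is an illustration and not the original poster's code; check_exactly_one is a hypothetical helper added only to show that the two predicates are complementary.

```cpp
#include <cassert>

// True when the least significant byte of an unsigned is stored first.
bool is_little_endian()
{
    const unsigned i = 1u;
    // Inspecting the bytes of an object through unsigned char is permitted.
    return *reinterpret_cast<const unsigned char*>(&i) == 1;
}

// True when the lowest-addressed byte of 1u is zero.
bool is_big_endian()
{
    const unsigned i = 1u;
    return *reinterpret_cast<const unsigned char*>(&i) == 0;
}

// Hypothetical self-check: the first byte of 1u is either 1 or 0,
// so exactly one of the two predicates is true.
void check_exactly_one()
{
    assert(is_little_endian() != is_big_endian());
}
```

With this naming, the function in the question corresponds to is_little_endian, since it tests whether the first byte equals 1.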




Answer 5:


The code has undefined behavior, although some (most?) compilers will define it, at least in limited cases.

The intent of the standard is that reinterpret_cast be used for this. This intent isn't well expressed, however, since the standard can't really define the behavior; there is no desire to define it when the hardware won't support it (e.g. because of alignment issues). And it's also clear that you can't just reinterpret_cast between two arbitrary types and expect it to work.

From a quality of implementation point of view, I would expect both the union trick and reinterpret_cast to work, if the union or the reinterpret_cast is in the same functional block; the union should work as long as the compiler can see that the ultimate type is a union (although I've used compilers where this wasn't the case).
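The alignment concern can be sidestepped by copying bytes instead of casting between unrelated pointer types. The following is a minimal sketch under that assumption, not something the answer itself proposes; load_u32 and check_round_trip are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: read a 32-bit value from an address that may be
// misaligned. Dereferencing reinterpret_cast<const std::uint32_t*>(p) here
// would not be guaranteed to work; std::memcpy has no alignment requirement.
std::uint32_t load_u32(const unsigned char* p)
{
    std::uint32_t v;
    std::memcpy(&v, p, sizeof v);
    return v;
}

// Hypothetical self-check: write a value at a deliberately odd offset and
// read it back through the helper.
void check_round_trip()
{
    unsigned char buffer[8] = {};
    const std::uint32_t value = 0x11223344u;
    std::memcpy(buffer + 1, &value, sizeof value);
    assert(load_u32(buffer + 1) == value);
}
```

Compilers commonly turn such a fixed-size memcpy into a single load where the hardware allows it, so this form usually costs nothing compared with the cast.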



Source: https://stackoverflow.com/questions/6136010/is-using-an-union-in-place-of-a-cast-well-defined
