reinterpret_cast, char*, and undefined behavior

Submitted by ☆樱花仙子☆ on 2019-12-18 13:06:48

Question


What are the cases where reinterpret_casting a char* (or char[N]) is undefined behavior, and when is it defined behavior? What is the rule of thumb I should be using to answer this question?


As we learned from this question, the following is undefined behavior:

alignas(int) char data[sizeof(int)];
int *myInt = new (data) int;           // OK
*myInt = 34;                           // OK
int i = *reinterpret_cast<int*>(data); // <== UB! have to use std::launder
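For reference, the two well-defined ways of reading that int back can be sketched roughly as follows. This is a minimal illustration, assuming a C++17 compiler (std::launder does not exist before that); the function name store_and_reload is hypothetical, and the buffer uses unsigned char rather than char because the standard's "provides storage" wording names unsigned char and std::byte arrays.

```cpp
#include <cassert>
#include <new>  // placement new, std::launder (C++17)

// Hypothetical helper: constructs an int inside a byte buffer and reads it
// back in the two ways that do not rely on the stale array pointer.
int store_and_reload(int value) {
    alignas(int) unsigned char data[sizeof(int)];

    int* p = new (data) int(value);  // starts the lifetime of an int in data
    int via_new = *p;                // OK: p points to the new int object

    // OK in C++17: launder yields a pointer to the int that now lives there.
    int via_launder = *std::launder(reinterpret_cast<int*>(data));

    assert(via_new == via_launder);
    return via_launder;
}
```

Keeping the pointer returned by placement new is usually the simpler choice; std::launder is only needed when that pointer is not available.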

But at what point can we do a reinterpret_cast on a char array and have it NOT be undefined behavior? Here are a few simple examples:

  1. No new, just reinterpret_cast:

    alignas(int) char data[sizeof(int)];
    *reinterpret_cast<int*>(data) = 42;    // is the first cast write UB?
    int i = *reinterpret_cast<int*>(data); // how about a read?
    *reinterpret_cast<int*>(data) = 4;     // how about the second write?
    int j = *reinterpret_cast<int*>(data); // or the second read?
    

    When does the lifetime for the int start? Is it with the declaration of data? If so, when does the lifetime of data end?

  2. What if data were a pointer?

    char* data_ptr = new char[sizeof(int)];
    *reinterpret_cast<int*>(data_ptr) = 4;     // is this UB?
    int i = *reinterpret_cast<int*>(data_ptr); // how about the read?
    
  3. What if I'm just receiving structs on the wire and want to conditionally cast them based on what the first byte is?

    // bunch of handle functions that do stuff with the members of these types
    void handle(MsgType1 const& );
    void handle(MsgTypeF const& );
    
    char buffer[100]; 
    ::recv(some_socket, buffer, 100, 0);
    
    switch (buffer[0]) {
    case '1':
        handle(*reinterpret_cast<MsgType1*>(buffer)); // is this UB?
        break;
    case 'F':
        handle(*reinterpret_cast<MsgTypeF*>(buffer));
        break;
    // ...
    }
    

Are any of these cases UB? Are all of them? Does the answer to this question change between C++11 and C++1z?


Answer 1:


There are two rules at play here:

  1. [basic.lval]/8, aka, the strict aliasing rule: simply put, you can't access an object through a pointer/reference to the wrong type.

  2. [basic.life]/8: simply put, if you reuse storage for an object of a different type, you can't use pointers to the old object(s) without laundering them first.

These rules are an important part of making a distinction between "a memory location" or "a region of storage" and "an object".
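The aliasing rule is asymmetric: it lists char, unsigned char (and std::byte since C++17) as types through which any object may be examined, so going from an int to its bytes is fine even though going from a char array to an int is not. A small sketch of the permitted direction, using a hypothetical helper named nonzero_bytes:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: counts the non-zero bytes in the object
// representation of an int by viewing it through unsigned char.
std::size_t nonzero_bytes(const int& value) {
    // Permitted direction: examining an existing int object as bytes.
    const unsigned char* bytes = reinterpret_cast<const unsigned char*>(&value);

    std::size_t count = 0;
    for (std::size_t i = 0; i < sizeof(int); ++i) {
        if (bytes[i] != 0) {
            ++count;
        }
    }
    return count;
}
```

Here the object really is an int and only its bytes are inspected. The examples in the question go the other way: the object is a char array and the code pretends it is an int.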

All of your code examples fall prey to the same problem: the storage does not contain an object of the type you cast it to:

alignas(int) char data[sizeof(int)];

That creates an object of type char[sizeof(int)]. That object is not an int. Therefore, you may not access it as if it were. It doesn't matter if it is a read or a write; you still provoke UB.

Similarly:

char* data_ptr = new char[sizeof(int)];

That also creates an object of type char[sizeof(int)].
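For both of these cases, one way to stay within the rules is to never form an int* into the char storage at all, and instead copy bytes between the storage and a real int. The following is a sketch under that assumption; store_int and load_int are hypothetical names, not anything from the question.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: writes the bytes of value into char storage.
// The storage stays a char array; no int object is claimed to live there.
void store_int(char* storage, int value) {
    std::memcpy(storage, &value, sizeof(int));
}

// Hypothetical helper: copies the bytes back out into a genuine int object.
int load_int(const char* storage) {
    int value;
    std::memcpy(&value, storage, sizeof(int));
    return value;
}
```

The same two helpers work for the stack array (`store_int(data, 42)`) and for the heap allocation (`store_int(data_ptr, 4)`), and they impose no alignment requirement on the storage. Compilers typically turn such fixed-size copies into plain loads and stores.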

char buffer[100];

This creates an object of type char[100]. That object is neither a MsgType1 nor a MsgTypeF. So you cannot access it as if it were either.

Note that the UB here is when you access the buffer as one of the Msg* types, not when you check the first byte. If all your Msg* types are trivially copyable, it's perfectly acceptable to read the first byte, then copy the buffer into an object of the appropriate type.

switch (buffer[0]) {
case '1':
    {
        MsgType1 msg;
        memcpy(&msg, buffer, sizeof(MsgType1));
        handle(msg);
    }
    break;
case 'F':
    {
        MsgTypeF msg;
        memcpy(&msg, buffer, sizeof(MsgTypeF));
        handle(msg);
    }
    break;
// ...
}
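The copy step can be factored into a small helper that also enforces the trivially-copyable requirement and checks that enough bytes actually arrived. This is only a sketch: the layouts of MsgType1 and MsgTypeF below are invented for illustration (the question never defines them), and decode is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical wire formats; the real message types are not shown in the question.
struct MsgType1 { char tag; char payload[3]; };
struct MsgTypeF { char tag; char code; };

// Hypothetical helper: copies the leading bytes of buffer into out.
// Returns false if fewer than sizeof(T) bytes were received.
template <typename T>
bool decode(const char* buffer, std::size_t received, T& out) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "memcpy is only valid for trivially copyable types");
    if (received < sizeof(T)) {
        return false;  // short read: do not copy a partial message
    }
    std::memcpy(&out, buffer, sizeof(T));
    return true;
}
```

Each case of the switch then becomes something like `MsgTypeF msg; if (decode(buffer, received, msg)) handle(msg);`, where `received` is assumed to be the byte count returned by `recv`.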

Note that we're talking about what the language standard defines as undefined behavior. Odds are good that your compiler will generate the code you expect for any of these, but the standard does not guarantee it.

"Does the answer to this question change between C++11 and C++1z?"

There have been some significant rule clarifications since C++11 (particularly [basic.life]). But the intent behind the rules hasn't changed.



Source: https://stackoverflow.com/questions/39429476/reinterpret-cast-char-and-undefined-behavior
