Dealing with endianness in C++

走了就别回头了 2020-12-17 05:44

I am translating a system from Python to C++. I need to be able to do in C++ what is normally done in Python with struct.unpack.

5 answers
  • 2020-12-17 06:05

    If the values you receive are truly strings (char* or std::string) and you know their format, then sscanf() and atoi() (really the whole ato*() family) will be your friends. They take well-formed strings and convert them according to the formats you pass in, a kind of reverse printf.
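
A minimal sketch of the text-parsing route, assuming the input really is formatted text such as "12,-7". The helper names parse_pair and parse_int are hypothetical and only illustrate the standard std::sscanf and std::atoi calls.

```cpp
#include <cstdio>
#include <cstdlib>

// Hypothetical helper: parse "x,y" style text into two ints.
// std::sscanf returns the number of fields converted, so anything
// other than 2 means the text did not match the format.
bool parse_pair(const char* text, int& x, int& y) {
    return std::sscanf(text, "%d,%d", &x, &y) == 2;
}

// Hypothetical helper: std::atoi converts a leading decimal number.
// It returns 0 when nothing can be converted, so it cannot tell "0"
// apart from garbage; std::strtol reports errors more precisely.
int parse_int(const char* text) {
    return std::atoi(text);
}
```

Checking the return value of sscanf is what separates a full match from a partial one.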

  • 2020-12-17 06:06

    This falls in the realm of bit twiddling.

    for (i=0;i<sizeof(struct foo);i++) dst[i] = src[i ^ mask]; 
    

    where mask == sizeof(type) - 1 if the stored and native endianness differ, and 0 if they match.

    With this technique a struct can be described by one mask per byte:

     struct foo {
        byte a,b;       //  mask = 0,0
        short e;        //  mask = 1,1
        int g;          //  mask = 3,3,3,3,
        double i;       //  mask = 7,7,7,7,7,7,7,7
     } s; // note that every member must be aligned according to its native size
    

    These masks can in turn be encoded with two bits per byte, storing n for the mask (1<<n)-1. On a 64-bit machine a single constant can therefore hold the masks for a 32-byte struct (with 1-, 2-, 4- and 8-byte alignments).

    unsigned int mask = 0xffffaa50;  // or zero if the endianness matches
    for (i=0;i<16;i++) { 
         dst[i]=src[i ^ ((1<<(mask & 3))-1)]; mask>>=2;
    }
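
The XOR-index trick above can be sketched as a self-contained function. This is only an illustration under assumptions: the name swap_fields is hypothetical, the layout matches the 16-byte struct shown earlier, and it operates on raw byte arrays so it behaves the same on any host.

```cpp
#include <cstddef>
#include <cstdint>

// Copy n bytes from src to dst, reversing the bytes of each aligned field.
// packed_masks holds two bits per byte, lowest bits first; a value k means
// the byte belongs to a field of size 1 << k, so the XOR mask is (1 << k) - 1.
// A 32-bit constant covers at most 16 bytes.
void swap_fields(const unsigned char* src, unsigned char* dst,
                 std::size_t n, std::uint32_t packed_masks) {
    for (std::size_t i = 0; i < n; ++i) {
        const std::size_t m = (std::size_t{1} << (packed_masks & 3u)) - 1;
        dst[i] = src[i ^ m];   // i ^ m mirrors i inside its aligned field
        packed_masks >>= 2;    // move on to the next byte's two bits
    }
}
```

With packed_masks = 0xffffaa50 the first two bytes stay in place, the next two are swapped, the following four are reversed, and the last eight are reversed. The technique relies on every field being aligned to its own size.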
    
  • 2020-12-17 06:14

    First, the cast you're doing:

    char *str = ...;
    int32_t i = *(int32_t*)str;
    

    results in undefined behavior due to the strict aliasing rule (unless str is initialized with something like int32_t x; char *str = (char*)&x;). In practical terms that cast can result in an unaligned read which causes a bus error (a crash) on some platforms and slow performance on others.

    Instead you should be doing something like:

    int32_t i;
    std::memcpy(&i, str, sizeof(i));
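
As a small sketch of the memcpy approach wrapped in a function (the name load_i32_native is hypothetical), this reads four bytes from any address, aligned or not, in the host's own byte order:

```cpp
#include <cstdint>
#include <cstring>

// Read 4 bytes starting at p into an int32_t using the host byte order.
// std::memcpy has no alignment requirement and does not violate strict
// aliasing; compilers typically reduce it to a single load.
std::int32_t load_i32_native(const char* p) {
    std::int32_t v;
    std::memcpy(&v, p, sizeof v);
    return v;
}
```

This only fixes the aliasing and alignment problem. Any byte-order conversion still has to happen afterwards.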
    

    There are a number of functions for swapping bytes between the host's native byte ordering and network (big-endian) byte ordering: ntoh*() and hton*(), where * is l for 32-bit values or s for 16-bit values. Since different hosts may have different byte orderings, these may be what you want if the data you're reading uses a consistent serialized form on all platforms.

    i = ntohl(i);
    

    You can also manually move bytes around in str before copying it into the integer.

    std::swap(str[0],str[3]);
    std::swap(str[1],str[2]);
    std::memcpy(&i,str,sizeof(i));
    

    Or you can manually manipulate the integer's value using shifts and bitwise operators.

    std::memcpy(&i,str,sizeof(i));
    i = (i&0xFFFF0000)>>16 | (i&0x0000FFFF)<<16;
    i = (i&0xFF00FF00)>>8  | (i&0x00FF00FF)<<8;
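
The shift-and-mask swap can be sketched as a function on an unsigned type, which sidesteps any question about shifting negative values. The name bswap32 is hypothetical.

```cpp
#include <cstdint>

// Reverse the four bytes of v using the two-step shift-and-mask scheme.
std::uint32_t bswap32(std::uint32_t v) {
    v = (v & 0xFFFF0000u) >> 16 | (v & 0x0000FFFFu) << 16;  // swap the 16-bit halves
    v = (v & 0xFF00FF00u) >> 8  | (v & 0x00FF00FFu) << 8;   // swap bytes within each half
    return v;
}
```

For example, bswap32(0x11223344) yields 0x44332211, and applying it twice returns the original value.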
    
  • 2020-12-17 06:27

    For 32 and 16-bit values:

    This is exactly the problem you have with network data, which is big-endian. You can use ntohl to convert a 32-bit value into host order, which is little-endian in your case.

    The ntohl() function converts the unsigned integer netlong from network byte order to host byte order.

    int res = ntohl(*((int32_t*) str));
    

    This also handles the case where your host is big-endian: ntohl then does nothing.
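
A hedged sketch of the ntohl route that copies the bytes out first instead of casting the pointer. It assumes a POSIX system, where ntohl is declared in <arpa/inet.h> (on Windows it comes from <winsock2.h>). The function name read_network_u32 is hypothetical.

```cpp
#include <arpa/inet.h>  // POSIX declaration of ntohl (assumption: not Windows)
#include <cstdint>
#include <cstring>

// Interpret the 4 bytes at p as a big-endian (network order) unsigned value.
std::uint32_t read_network_u32(const char* p) {
    std::uint32_t raw;
    std::memcpy(&raw, p, sizeof raw);  // safe for unaligned input
    return ntohl(raw);                 // no-op on big-endian hosts
}
```

The bytes 0x12 0x34 0x56 0x78 come back as 0x12345678 regardless of the host's byte order.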

    For 64-bit values

    On Linux/BSD there is a non-standard option: see the question "64 bit ntohl() in C++?", which points to htobe64.

    These functions convert the byte encoding of integer values from the byte order that the current CPU (the "host") uses, to and from little-endian and big-endian byte order.

    For Windows, see the question "How do I convert between big-endian and little-endian values in C++?"

    It points to _byteswap_uint64, as well as 16- and 32-bit equivalents and the gcc-specific __builtin_bswap32/__builtin_bswap64 calls.

    Other Sizes

    Most systems don't have values that aren't 16, 32 or 64 bits long. In that case I might store the value in a 64-bit variable, shift it, and then translate. I'd write some good tests. I suspect this is an uncommon situation, and more details would help.
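
One way to read the "store it in a 64-bit value and shift" idea, as a sketch only: assemble an odd-sized big-endian field (for example 3 bytes) into a 64-bit integer, then sign-extend it if it is signed. The names read_be and sign_extend are hypothetical, and two's-complement data is assumed.

```cpp
#include <cstddef>
#include <cstdint>

// Assemble the first n bytes (1 <= n <= 8) at p as a big-endian unsigned value.
std::uint64_t read_be(const unsigned char* p, std::size_t n) {
    std::uint64_t v = 0;
    for (std::size_t i = 0; i < n; ++i) {
        v = (v << 8) | p[i];  // earlier bytes end up more significant
    }
    return v;
}

// Sign-extend an n-byte two's-complement value held in the low bits of v.
// Assumes 1 <= n <= 8 and that the bits above the field are zero.
std::int64_t sign_extend(std::uint64_t v, std::size_t n) {
    if (n >= 8) return static_cast<std::int64_t>(v);
    const std::uint64_t sign = std::uint64_t{1} << (8 * n - 1);
    return static_cast<std::int64_t>(v ^ sign) - static_cast<std::int64_t>(sign);
}
```

Because the value is built with shifts rather than copied from memory, the result does not depend on the host's byte order.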

  • 2020-12-17 06:30

    Unpack the string one byte at a time.

    unsigned char *str;
    unsigned int result;
    
    result =  (unsigned int)*str++ << 24;  // cast first: a promoted int could overflow here
    result |= *str++ << 16;
    result |= *str++ << 8;
    result |= *str++;
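
The byte-at-a-time approach can be packaged roughly like Python's struct.unpack(">I", ...) and struct.unpack("<I", ...). This is a sketch with hypothetical names (unpack_be32, unpack_le32), and the caller is assumed to guarantee at least four readable bytes.

```cpp
#include <cstdint>

// Roughly struct.unpack(">I", ...): most significant byte first.
std::uint32_t unpack_be32(const unsigned char* p) {
    return (std::uint32_t{p[0]} << 24) | (std::uint32_t{p[1]} << 16) |
           (std::uint32_t{p[2]} << 8)  |  std::uint32_t{p[3]};
}

// Roughly struct.unpack("<I", ...): least significant byte first.
std::uint32_t unpack_le32(const unsigned char* p) {
    return (std::uint32_t{p[3]} << 24) | (std::uint32_t{p[2]} << 16) |
           (std::uint32_t{p[1]} << 8)  |  std::uint32_t{p[0]};
}
```

Each byte is widened to std::uint32_t before shifting, so the result is the same on little-endian and big-endian hosts and no signed overflow can occur.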
    