Reading “integer” size bytes from a char* array.

Submitted by 青春壹個敷衍的年華 on 2019-11-28 07:38:16

Do you mean something like this?

char* a;
int i;
memcpy(&i, a, sizeof(i));

You only have to worry about endianness if the data comes from a different platform, such as an external device.
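As a minimal sketch of the memcpy approach, assuming the buffer holds at least offset + sizeof(int) bytes that were written on the same platform (the helper name read_int_at and its offset parameter are illustrative, not part of the question):

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy sizeof(int) bytes starting at buf + offset
 * into an int. memcpy has no alignment requirement, so the offset may
 * be odd. The bytes are interpreted in the host's own byte order. */
int read_int_at(const char *buf, size_t offset)
{
    int value;
    memcpy(&value, buf + offset, sizeof value);
    return value;
}
```

Because the copy goes through memcpy rather than a pointer cast, it should behave the same whether or not buf + offset happens to be aligned for an int.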

a) You only need to worry about "endianness" (i.e., byte-swapping) if the data was created on a big-endian machine and is being processed on a little-endian machine, or vice versa. There are many ways this can occur, but here are a couple of examples.

  1. You receive data on a Windows machine via a socket. Windows employs a little-endian architecture while network data is "supposed" to be in big-endian format.
  2. You process a data file that was created on a system with a different "endianness."

In either of these cases, you'll need to byte-swap all numbers that are bigger than 1 byte, e.g., shorts, ints, longs, doubles, etc. However, if you are always dealing with data from the same platform, endian issues are of no concern.

b) Based on your question, it sounds like you have a char pointer and want to extract the first 4 bytes as an int and then deal with any endian issues. To do the extraction, use this:

int n = *(reinterpret_cast<int *>(myArray)); // where myArray is your data

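Obviously, this assumes myArray is not a null pointer; otherwise, this will crash since it dereferences the pointer, so employ a good defensive programming scheme. It also assumes myArray is suitably aligned for an int. Strictly speaking, reading a char buffer through an int pointer violates the aliasing rules, so copying the bytes into an int with memcpy is the safer route.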

To swap the bytes on Windows, you can use the ntohs()/ntohl() and/or htons()/htonl() functions defined in winsock2.h. Or you can write some simple routines to do this in C++, for example:

inline unsigned short swap_16bit(unsigned short us)
{
    return (unsigned short)(((us & 0xFF00) >> 8) |
                            ((us & 0x00FF) << 8));
}

inline unsigned long swap_32bit(unsigned long ul)
{
    return (unsigned long)(((ul & 0xFF000000) >> 24) |
                           ((ul & 0x00FF0000) >>  8) |
                           ((ul & 0x0000FF00) <<  8) |
                           ((ul & 0x000000FF) << 24));
}
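The width of unsigned short and unsigned long varies between platforms (unsigned long is 32 bits on Windows but 64 bits on most 64-bit Unix-like systems), so a variant built on the fixed-width types from <stdint.h> may be easier to reason about. A sketch, with the names swap_u16 and swap_u32 chosen here purely for illustration:

```c
#include <stdint.h>

/* Swap the two bytes of a 16-bit value. */
uint16_t swap_u16(uint16_t v)
{
    return (uint16_t)((v >> 8) | (v << 8));
}

/* Reverse the four bytes of a 32-bit value. */
uint32_t swap_u32(uint32_t v)
{
    return ((v & 0xFF000000u) >> 24) |
           ((v & 0x00FF0000u) >>  8) |
           ((v & 0x0000FF00u) <<  8) |
           ((v & 0x000000FFu) << 24);
}
```

Applying either function twice returns the original value, which makes for a quick sanity check.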

It depends on how you want to read them. I get the feeling you want to cast 4 bytes into an integer; doing so over network-streamed data will usually end up in something like this:

int foo = *(int*)(stream+offset_in_stream);

The easy way to solve this is to make sure whatever generates the bytes does so in a consistent endianness. Typically the "network byte order" used by various TCP/IP stuff is best: the library routines htonl and ntohl work very well with this, and they are usually fairly well optimized.
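To make the network-byte-order convention concrete, here is a small sketch of a writer and a reader that agree on big-endian 32-bit values. It assumes a POSIX system where htonl and ntohl come from <arpa/inet.h> (on Windows they live in winsock2.h); the function names put_u32_net and get_u32_net are made up for this example.

```c
#include <arpa/inet.h>  /* htonl, ntohl (POSIX) */
#include <stdint.h>
#include <string.h>

/* Writer side: store value in out[0..3] in network (big-endian) order. */
void put_u32_net(unsigned char *out, uint32_t value)
{
    uint32_t net = htonl(value);
    memcpy(out, &net, sizeof net);
}

/* Reader side: load four network-order bytes and convert to host order. */
uint32_t get_u32_net(const unsigned char *in)
{
    uint32_t net;
    memcpy(&net, in, sizeof net);
    return ntohl(net);
}
```

As long as both ends go through these two helpers, neither side needs to know the other's native byte order.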

However, if network byte order is not being used, you may need to do things in other ways. You need to know two things: the size of an integer, and the byte order. Once you know that, you know how many bytes to extract and in which order to put them together into an int.

Some example code that assumes sizeof(int) is the right number of bytes:

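#include <limits.h>
#include <stddef.h>

int bytes_to_int_big_endian(const char *bytes)
{
    size_t i;
    unsigned int result = 0;

    /* Go through unsigned char so bytes >= 0x80 are not sign-extended. */
    for (i = 0; i < sizeof(int); ++i)
        result = (result << CHAR_BIT) | (unsigned char)bytes[i];
    return (int)result;
}

int bytes_to_int_little_endian(const char *bytes)
{
    size_t i;
    unsigned int result = 0;

    /* Shift in unsigned arithmetic to avoid overflowing a signed int. */
    for (i = 0; i < sizeof(int); ++i)
        result |= (unsigned int)(unsigned char)bytes[i] << (i * CHAR_BIT);
    return (int)result;
}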


#ifdef TEST

#include <stdio.h>

int main(void)
{
    const int correct = 0x01020304;
    const char little[] = "\x04\x03\x02\x01";
    const char big[] = "\x01\x02\x03\x04";

    printf("correct: %x\n", (unsigned)correct);
    printf("from big-endian: %x\n", (unsigned)bytes_to_int_big_endian(big));
    printf("from little-endian: %x\n", (unsigned)bytes_to_int_little_endian(little));
    return 0;
}

#endif

How about

#include <stddef.h>  /* size_t */

int int_from_bytes(const char * bytes, _Bool reverse)
{
    if(!reverse)
        return *(int *)(void *)bytes;

    char tmp[sizeof(int)];

    for(size_t i = sizeof(tmp); i--; ++bytes)
        tmp[i] = *bytes;

    return *(int *)(void *)tmp;
}

You'd use it like this:

int i = int_from_bytes(bytes, SYSTEM_ENDIANNESS != ARRAY_ENDIANNESS);

If you're on a system where casting void * to int * may result in alignment conflicts, you can use

#include <string.h>  /* memcpy, size_t */

int int_from_bytes(const char * bytes, _Bool reverse)
{
    int tmp;

    if(reverse)
    {
        for(size_t i = sizeof(tmp); i--; ++bytes)
            ((char *)&tmp)[i] = *bytes;
    }
    else memcpy(&tmp, bytes, sizeof(tmp));

    return tmp;
}
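SYSTEM_ENDIANNESS and ARRAY_ENDIANNESS are not defined anywhere in this answer, so they have to come from your build or from knowledge of the data format. One possible way to obtain the host side at run time is sketched below; host_is_little_endian and needs_reverse are hypothetical names introduced only for this example.

```c
#include <string.h>

/* Returns 1 on a little-endian host, 0 on a big-endian one, by checking
 * which byte of the value 1 is stored at the lowest address. */
int host_is_little_endian(void)
{
    const unsigned int probe = 1u;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Hypothetical wrapper: nonzero when the array's byte order differs
 * from the host's, i.e. when the bytes must be reversed. */
int needs_reverse(int array_is_little_endian)
{
    return host_is_little_endian() != (array_is_little_endian != 0);
}
```

The result of needs_reverse could then be passed as the reverse argument of a function like the one above.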

You shouldn't need to worry about endianness unless you are reading the bytes from a source created on a different machine, e.g. a network stream.

Given that, can't you just use a for loop?

void ReadBytes(char * stream) {
    for (int i = 0; i < (int)sizeof(int); i++) {
        char foo = stream[i];  /* one byte of the integer; process it here */
    }
}

Are you asking for something more complicated than that?

You need to worry about endianness only if the data you're reading is composed of numbers that are larger than one byte.
If you're reading sizeof(int) bytes and expect to interpret them as an int, then endianness makes a difference. Essentially, endianness is the way in which a machine interprets a sequence of more than one byte as a numerical value.

Just use a for loop that moves over the array in sizeof(int) chunks.
Use the function ntohl (found in the header <arpa/inet.h>, at least on Linux) to convert from bytes in the network order (network order is defined as big-endian) to local byte-order. That library function is implemented to perform the correct network-to-host conversion for whatever processor you're running on.
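A rough sketch of that loop, assuming the integers on the wire are 32 bits wide and stored in network order, and that the caller guarantees the buffer holds at least count * 4 bytes. The function name decode_u32_array is invented for illustration, and <arpa/inet.h> is the POSIX location of ntohl.

```c
#include <arpa/inet.h>  /* ntohl (POSIX) */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Decode `count` big-endian 32-bit values from `data` into `out`. */
void decode_u32_array(const char *data, size_t count, uint32_t *out)
{
    for (size_t i = 0; i < count; ++i) {
        uint32_t net;
        /* memcpy sidesteps any alignment requirement on `data`. */
        memcpy(&net, data + i * sizeof net, sizeof net);
        out[i] = ntohl(net);
    }
}
```

Fixed-width uint32_t is used here instead of int because ntohl operates on 32-bit values regardless of how wide int is on the host.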

Why read when you can just compare?

bool AreEqual(int i, char *data)
{
   return memcmp(&i, data, sizeof(int)) == 0;
}

You only need to worry about endianness when you have to convert integers to some platform-independent form; htonl and ntohl are good examples of functions that do this.
