Fastest Method to Split a 32 Bit number into Bytes in C++

梦谈多话 2021-01-21 03:15

I am writing a piece of code designed to do some data compression on CLSID structures. I'm storing them as a compressed stream of 128-bit integers. However, the code in questio

7 Answers
  •  没有蜡笔的小新
    2021-01-21 03:41

    compressedBytes.push_back(either.bytes.b[0]);
    compressedBytes.push_back(either.bytes.b[1]);
    compressedBytes.push_back(either.bytes.b[2]);
    compressedBytes.push_back(either.bytes.b[3]);
    

    There is an even smarter and faster way! Let's see what this code is doing and how we can improve it.

    This code serializes the integer one byte at a time. Each push_back call checks for free space in the vector's internal buffer. If there is no room for another byte, the buffer is reallocated (hint: slow!). Granted, reallocation is infrequent, because implementations typically grow the buffer geometrically (often by doubling it). The new byte is then copied in and the internal size is increased by one.
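    To make the reallocation cost concrete, here is a minimal sketch that counts how often the capacity changes while bytes are pushed one at a time, with and without a prior reserve(). The helper name countReallocations is hypothetical, and std::uint8_t stands in for BYTE; neither comes from the original code.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Pushes `count` bytes one at a time and returns how many times the
// vector's capacity changed, i.e. how many reallocations took place.
// Hypothetical helper for illustration only.
std::size_t countReallocations(std::size_t count, bool reserveFirst) {
    std::vector<std::uint8_t> bytes;
    if (reserveFirst) {
        bytes.reserve(count);  // one up-front allocation
    }
    std::size_t reallocations = 0;
    std::size_t lastCapacity = bytes.capacity();
    for (std::size_t i = 0; i < count; ++i) {
        bytes.push_back(static_cast<std::uint8_t>(i));
        if (bytes.capacity() != lastCapacity) {
            ++reallocations;
            lastCapacity = bytes.capacity();
        }
    }
    return reallocations;
}
```

    With reserveFirst set, the loop itself should trigger no reallocation; without it, the exact count depends on the implementation's growth factor.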

    The standard requires vector<> to keep its elements in one contiguous internal buffer. vector<> also provides operator[] (), which returns a reference to an element, so taking the address of the first element (&v[0]) yields a pointer into that buffer.

    So, here is a faster way to write it:

    std::string invalidClsids("This is a test string");
    std::vector<BYTE> compressedBytes;
    DWORD invalidLength = (DWORD) invalidClsids.length();
    compressedBytes.resize(sizeof(DWORD)); // You probably want to make this much larger, to avoid resizing later.
    // compressedBytes is now at least as large as the length field we want to serialize.
    BYTE* p = &compressedBytes[0]; // Valid: the standard guarantees contiguous storage. p points to a buffer that is at least as large as a DWORD.
    *((DWORD*)p) = invalidLength;  // Copy all bytes in one go!
    

    The cast could be folded into the &compressedBytes[0] expression on a single line, but that would not be any faster. The two-step form is more readable.
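    One caveat worth noting: writing a DWORD through a pointer cast from BYTE* relies on the buffer being suitably aligned and is not strictly conforming under the language's aliasing rules. A hedged alternative is std::memcpy, which copies the same four bytes in native order and which compilers usually reduce to a single store. In this sketch the name writeNative is hypothetical, and std::uint8_t and std::uint32_t stand in for BYTE and DWORD.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Writes `value` into the first four bytes of `out` using the host's
// native byte order. Hypothetical helper, not the answer's original code.
void writeNative(std::vector<std::uint8_t>& out, std::uint32_t value) {
    if (out.size() < sizeof(value)) {
        out.resize(sizeof(value));  // make sure the destination is large enough
    }
    // memcpy has no alignment or aliasing requirements on the destination.
    std::memcpy(out.data(), &value, sizeof(value));
}
```

    Like the cast, this writes the bytes in whatever order the host CPU uses, so the resulting stream is still endian-dependent.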

    NOTE! Serializing this way (or with the union method) is endian-dependent. That is, on an Intel/AMD (x86) processor the least significant byte comes first, while on a big-endian machine (PowerPC, Motorola...) the most significant byte comes first. If you want the output to be endian-neutral, you must extract the bytes arithmetically (with shifts).
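    The shift-based approach can be sketched as follows. It always emits the least significant byte first, regardless of the host CPU. The function names appendLittleEndian and readLittleEndian are hypothetical, and std::uint8_t and std::uint32_t stand in for BYTE and DWORD.

```cpp
#include <cstdint>
#include <vector>

// Appends `value` to `out` as four bytes, least significant byte first,
// independent of the host's byte order. Hypothetical helper.
void appendLittleEndian(std::vector<std::uint8_t>& out, std::uint32_t value) {
    out.push_back(static_cast<std::uint8_t>(value & 0xFFu));
    out.push_back(static_cast<std::uint8_t>((value >> 8) & 0xFFu));
    out.push_back(static_cast<std::uint8_t>((value >> 16) & 0xFFu));
    out.push_back(static_cast<std::uint8_t>((value >> 24) & 0xFFu));
}

// Reassembles a 32-bit value from four bytes written by appendLittleEndian.
std::uint32_t readLittleEndian(const std::uint8_t* p) {
    return static_cast<std::uint32_t>(p[0])
         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16)
         | (static_cast<std::uint32_t>(p[3]) << 24);
}
```

    For example, appending 0x11223344 produces the bytes 0x44, 0x33, 0x22, 0x11 on any machine, and readLittleEndian recovers the original value from them.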
