What is the most efficient way to extract an arbitrary range of bits from a contiguous sequence of words?
Suppose we have a std::vector, or any other sequence container (sometimes it will be a deque), which stores uint64_t elements. Now, let's view this vector as a sequence of size() * 64 contiguous bits. I need to find the word formed by the bits in a given [begin, end) range, given that end - begin <= 64, so that it fits in a word. The solution I have right now finds the two words whose parts will form the result, masks each one separately, and combines them. Since I need this to be as efficient as possible, I've tried to code everything without any if branch, to avoid branch mispredictions, so for