Any smarter way to extract from array of bits?

天涯浪子 提交于 2019-12-06 22:40:24

问题


I have areas of memory that could be considered "array of bits". They are equivalent to

unsigned char arr[256];

But it would be better thought of as

bit arr[2048];

I'm accessing separate bits from it with

#define GETBIT(x,in)   ((in)[ ((x)/8) ] & 1<<(7-((x)%8)))

but I do it a lot in many places of the code, often in performance-critical sections and I wonder if there are any smarter, more optimal methods to do it.

extra info: Architecture: ARM9 (32 bit); gcc/Linux. The physical data representation can't be changed - it is externally provided or mapped for external use.


回答1:


For randomly accessing individual bits, the macro you've suggested is as good as you're going to get (as long as you turn on optimisations in your compiler).

If there is any pattern at all to the bits you're accessing, then you may be able to do better. For example, if you often access pairs of bits, then you may see some improvement by providing a method to get two bits instead of one, even if you don't always end up using both bits.

As with any optimisation problem, you will need to be very familiar with the behaviour of your code, in particular its access patterns in your bit array, to make a meaningful improvement in performance.

Update: Since you access ranges of bits, you can probably squeeze some more performance out of your macros. For example, if you need to access four bits you might have macros like this:

#define GETBITS_0_4(x,in) (((in)[(x)/8] & 0x0f))
#define GETBITS_1_4(x,in) (((in)[(x)/8] & 0x1e) >> 1)
#define GETBITS_2_4(x,in) (((in)[(x)/8] & 0x3c) >> 2)
#define GETBITS_3_4(x,in) (((in)[(x)/8] & 0x78) >> 3)
#define GETBITS_4_4(x,in) (((in)[(x)/8] & 0xf0) >> 4)
#define GETBITS_5_4(x,in) ((((in)[(x)/8] & 0xe0) >> 5) | (((in)[(x)/8+1] & 0x01)) << 3)
#define GETBITS_6_4(x,in) ((((in)[(x)/8] & 0xc0) >> 6) | (((in)[(x)/8+1] & 0x03)) << 2)
#define GETBITS_7_4(x,in) ((((in)[(x)/8] & 0x80) >> 7) | (((in)[(x)/8+1] & 0x07)) << 1)
// ...etc

These macros would clip out four bits from each bit position 0, 1, 2, etc. (To cut down on the proliferation of pointless parentheses, you might want to use inline functions for the above.) Then perhaps define an inline function like:

inline int GETBITS_4(int x, unsigned char *in) {
    switch (x % 8) {
        case 0: return GETBITS_0_4(x,in);
        case 1: return GETBITS_1_4(x,in);
        case 2: return GETBITS_2_4(x,in);
        // ...etc
    }
}

Since this is a lot of tedious boilerplate code, especially if you've got multiple different widths, you may want to write a program to generate all the GETBIT_* accessor functions.

(I notice that the bits in your bytes are stored in the reverse order from what I've written above. Apply an appropriate transformation to match your structure if you need to.)




回答2:


I don't think so. In fact, many CPU architectures won't access bits individually.

On C++ you have std::bitset<N>. but may not have highest-performance depending on your compiler's implementation and optimization.

BTW, it may be better to group your bit-array as uint32_t[32] (or uint64_t[16]) for aligned dereferencing (which bitset does this for you already).




回答3:


Taking Greg's solution as a basis:

template<unsigned int n, unsigned int m> 
inline unsigned long getbits(unsigned long[] bits) {
  const unsigned bitsPerLong = sizeof(unsigned long) * CHAR_BIT
  const unsigned int bitsToGet = m - n;
  BOOST_STATIC_ASSERT(bitsToGet < bitsPerLong);
  const unsigned mask = (1UL << bitsToGet) - 1;
  const size_t index0 = n / bitsPerLong;
  const size_t index1 = m / bitsPerLong;
  // Do the bits to extract straddle a boundary?
  if (index0 == index1) {
    return (bits[index0] >> (n % bitsPerLong)) & mask;
  } else {
    return ((bits[index0] >> (n % bitsPerLong)) + (bits[index1] << (bitsPerLong - (m % bitsPerLong)))) & mask;
  }
}

Can get at least 32 bits, even if they are not aligned. Note that's intentionally inline as you don't want to have tons of these functions.




回答4:


If You reverse the bit order in 'arr', then You can eliminate the substraction from the macro. It is the best what i can say, without knowledge of the problem context (how the bits are used).




回答5:


#define GETBIT(x,in)   ((in)[ ((x)/8) ] & 1<<(7-((x)%8)))

can be optimized.

1) Use standard int which is normally the fastest accessible integer datatype. If you don't need to be portable, you can find out the size of an int with sizeof and adapt the following code.

2)

#define GETBIT(x,in)   ((in)[ ((x) >>> 3) ] & 1<<((x) & 7))

The mod operator % is slower than ANDing. And you don't need to subtract, simply adjust your SETBIT routine.




回答6:


Why not create your own wrapper class?

You could then add bits to the "array" using an operator such as + and get back the individual bits using the [] operator.

Your macro could be improved by using & 7 instead of % 8 but its likely the compiler will make that optimisation for you anyway.

I recently did exactly what you are doing and my stream could consist of any number of bits.

So I have something like the following:

BitStream< 1 > oneBitBitStream;
BitStream< 2 > twoBitBitStream;

oneBitBitStream += Bit_One;
oneBitBitStream += Bit_Zero;

twoBitBitStream += Bit_Three;
twoBitBitStream += Bit_One;

and so on. It makes for nice readable code and you can provide an STL like interface to it for aiding faimilarity :)




回答7:


Since the question is tagged with C++, is there any reason you can't simply use the standard bitset?




回答8:


Instead of the unsigned char array and custom macros, you can use std::vector<bool>. The vector class template has a special template specialization for the bool type. This specialization is provided to optimize for space allocation: In this template specialization, each element occupies only one bit (which is eight times less than the smallest type in C++: char).



来源:https://stackoverflow.com/questions/2160720/any-smarter-way-to-extract-from-array-of-bits

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!