Which bitset implementation should I use for maximum performance?

问题

I'm currently trying to implement various algorithms in a Just In Time (JIT) compiler. Many of the algorithms operate on bitmaps, more commonly known as bitsets.

In C++ there are various ways of implementing a bitset. As a true C++ developer, I would prefer to use something from the STL. The most important aspect is performance. I don't necessarily need a dynamically resizable bitset.

As I see it, there are three possible options.

I. One option would be to use std::vector<bool>, which has been optimized for space. This would also indicate that the data doesn't have to be contiguous in memory. I guess this could decrease performance. On the other hand, having one bit for each bool value could improve speed since it's very cache friendly.

II. Another option would be to instead use a std::vector<char>. It guarantees that the data is contiguous in memory and it's easier to access individual elements. However, it feels strange to use this option since it's not intended to be a bitset.

III. The third option would be to use the actual std::bitset. That fact that it's not dynamically resizable doesn't matter.

Which one should I choose for maximum performance?

回答1:

Best way is to just benchmark it, because every situation is different.

I wouldn't use std::vector<bool>. I tried it once and the performance was horrible. I could improve the performance of my application by simply using std::vector<char> instead.

I didn't really compare std::bitset with std::vector<char>, but if space is not a problem in your case, I would go for std::vector<char>. It uses 8 times more space than a bitset, but since it doesn't have to do bit-operations to get or set the data, it should be faster.

Of course if you need to store lots of data in the bitset/vector, then it could be beneficial to use bitset, because that would fit in the cache of the processor.

The easiest way is to use a typedef, and to hide the implementation. Both bitset and vector support the [] operator, so it should be easy to switch one implementation by the other.

回答2:

I answered a similar question recently in this forum. I recommend my BITSCAN library. I have just released version 1.0. BITSCAN is specifically designed for fast bit scanning operations.

A BitBoard class wraps a number of different implementations for typical operations such as bsf, bsr or popcount for 64-bit words (aka bitboards). Classes BitBoardN, BBIntrin and BBSentinel extend bit scanning to bit strings. A bit string in BITSCAN is an array of bitboards. The base wrapper class for a bit string is BitBoardN. BBIntrin extends BitBoardN by using Windows compiler intrinsics over 64 bitboards. BBIntrin is made portable to POSIX by using the appropriate asm equivalent functions.

I have used BITSCAN to implement a number of efficient solvers for NP combinatorial problems in the graph domain. Typically the adjacency matrix of the graph as well as vertex sets are encoded as bit strings and typical computations are performed using bit masks. Code for simple bitencoded graph objects is available in GRAPH. Examples of how to use BITSCAN and GRAPH are also available.

A comparison between BITSCAN and typical implementations in STL (bitset) and BOOST (dynamic_bitset) can be found here: http://blog.biicode.com/bitscan-efficiency-at-glance/

回答3:

You might also be interested in this (somewhat dated) paper: http://www.cs.up.ac.za/cs/vpieterse/pub/PieterseEtAl_SAICSIT2010.pdf

来源：https://stackoverflow.com/questions/11712479/which-bitset-implementation-should-i-use-for-maximum-performance

标签

c++

performance

compiler-construction

bitset