Apparently this can be done quickly with "vertical counters". From the now-defunct page on Bit tricks (archive) by @steike:
Consider a normal array of integers, where we read the bits
horizontally:
      msb<-->lsb
x[0]  00000010  = 2
x[1]  00000001  = 1
x[2]  00000101  = 5
A vertical counter stores the numbers, as the name implies,
vertically; that is, a k-bit counter is stored across k words, with a
single bit in each word.
x[0]  00000110   lsb ↑
x[1]  00000001       |
x[2]  00000100       |
x[3]  00000000       |
x[4]  00000000   msb ↓
           512   (counter values, one per column, read bottom to top)
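To make the layout concrete, here is a minimal sketch of reading one counter back out of the vertical form. The helper name `read_counter` and the parameter names are illustrative, not from the quoted page; it assumes 64-bit words, a counter index `j` below 64, and at most 32 bit planes so the result fits in an `unsigned`.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: rebuild counter number j (bit position j of every
   word) from k bit planes, where planes[0] holds the least significant bits.
   Assumes j < 64 and k <= 32. */
static unsigned read_counter(const uint64_t *planes, size_t k, unsigned j)
{
    unsigned value = 0;
    for (size_t i = 0; i < k; i++) {
        /* Bit j of plane i contributes 2^i to counter j. */
        value |= (unsigned)((planes[i] >> j) & 1u) << i;
    }
    return value;
}
```

With the five words from the diagram above (0x06, 0x01, 0x04, 0x00, 0x00), counters 0, 1 and 2 read back as 2, 1 and 5, matching the horizontal array.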
With the numbers stored like this, we can use bitwise operations to
increment any subset of them all at once.
We create a bitmap with a 1 bit in the positions corresponding to the
counters we want to increment, and loop through the array from LSB up,
updating the bits as we go. The "carries" from one addition become
the input for the next element of the array.
input    sum
-------------
A B      C S
0 0      0 0
0 1      0 1     sum   = a ^ b
1 0      0 1     carry = a & b
1 1      1 0
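Because `^` and `&` act on every bit position independently, one pair of operations performs this half-adder step for all counters in a word at once. A small sketch, assuming 64-bit words; the name `half_add` is made up for illustration:

```c
#include <stdint.h>

/* Hypothetical helper: one half-adder step applied to all 64 bit lanes. */
static void half_add(uint64_t a, uint64_t b, uint64_t *sum, uint64_t *carry)
{
    *sum   = a ^ b;  /* 1 where exactly one input bit is set */
    *carry = a & b;  /* 1 where both input bits are set */
}
```

For example, with a = 0011 and b = 0101 (binary), the lanes give sum = 0110 and carry = 0001: only the lowest lane has both inputs set.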
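unsigned long carry = input;      /* bitmap: 1 = increment this counter */
unsigned long *p = buffer;        /* buffer[0] holds the LSBs */
while (carry) {                   /* assumes no counter overflows the buffer */
    unsigned long a = *p, b = carry;
    *p++ = a ^ b;                 /* sum bit for this word */
    carry = a & b;                /* carries feed the next word */
}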
For 64-bit words the loop will run 6-7 times on average -- the number of iterations is determined by the longest chain of carries.
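The loop as quoted stops only when the carries die out, so it relies on the buffer being deep enough for every counter. Below is a minimal sketch of a bounded variant, assuming 64-bit words and a fixed counter width of `k` bits. The names `vc_increment`, `planes`, `k` and `mask` are my own, not from the quoted page.

```c
#include <stddef.h>
#include <stdint.h>

/* Add 1 to every counter whose bit is set in mask.
   planes[0] holds the least significant bits; k is the counter width.
   Returns the carry out of the top plane: a set bit marks a counter
   that wrapped around to zero. */
static uint64_t vc_increment(uint64_t *planes, size_t k, uint64_t mask)
{
    uint64_t carry = mask;
    for (size_t i = 0; i < k && carry; i++) {
        uint64_t a = planes[i];
        planes[i] = a ^ carry;  /* sum bit for this plane */
        carry = a & carry;      /* carries feed the next plane */
    }
    return carry;
}
```

Starting from the words 0x06, 0x01, 0x04, 0x00, 0x00 and calling `vc_increment(planes, 5, 0x05)` bumps counters 0 and 2 (2 becomes 3, 5 becomes 6) in two loop iterations and leaves the words as 0x03, 0x05, 0x04, 0x00, 0x00. Returning the final carry is one possible way to detect overflow; saturating instead would need extra work that is not shown here.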