How to find number of 1's in a binary number in O(1) time?

女生的网名这么多〃 提交于 2019-12-05 02:01:42

问题


I know this has been asked before, but I'm looking at this particular solution listed here:

int BitCount(unsigned int u)
{
     unsigned int uCount;

     uCount = u - ((u >> 1) & 033333333333) - ((u >> 2) & 011111111111);
     return ((uCount + (uCount >> 3)) & 030707070707) % 63;
}

How does it work?

Are there any caveats involved here?

Theoretically is it possible to find the answer in constant time? I mean don't we actually have to iterate through the bits to count?


回答1:


Counting bits

An unsigned 32-bit integer u can be written like this:

u = a31 * 231 + a30 * 230 + ... + a0 * 20

We want the value of a31 + a30 + ... + a0.

Let's compare the values of u >> k:

u >> 0  = a31 * 231 + a30 * 230 + ... + a1 * 21 + a0 * 20
u >> 1  = a31 * 230 + a30 * 229 + ... + a1 * 20
u >> 2  = a31 * 229 + a30 * 228 + ...
...
u >> 29 = a31 * 22  + a29 * 21  + ...
u >> 30 = a31 * 21  + a30 * 20 
u >> 31 = a31 * 20 

We will calculate the bit population by this formula:

u >> 0 - u >> 1 - u >> 2 - ... - u >> 31 = p

Let's see why this works:

  u >> 0 - u >> 1 - u >> 2 - ... - u >> 31
= u >> 0 - (u >> 1 + u >> 2 + ... + u >> 31)
= u - q

What's the value of q? Let's calculate it bit by bit, looking at the values for u >> k above. For a31, it's:

  a31 * 230 + a31 * 229 + ...
= a31 * (230 + 229 + ...)
= a31 * (231 - 1)

Or for a30:

  a30 * 229 + a30 * 228 + ...
= a30 * (229 + 228 + ...)
= a30 * (230 - 1)

We find: q = a31 * (231 - 1) + a30 * (230 - 1) + ...

And thus

u - q = a31 * 231 - a31 * (231 - 1) + ...
      = a31 + a30 + ... + a0

Counting bits in 3-bit blocks

This algorithm starts by doing the same thing, but in blocks of 3 bits:

u >> 0                = AaBbbCccDddEeeFffGggHhhIiiJjjKkk (each letter is a bit)
u >> 1 & 033333333333 =  A Bb Cc Dd Ee Ff Gg Hh Ii Jj Kk (blank = zero)
u >> 2 & 011111111111 =     B  C  D  E  F  G  H  I  J  K

Taking the difference of these, by the above algorithm, each octet in uCount contains the number of bits set in the corresponding octet in u.

uCount      =   αβγδεζηθικλ (each greek letter is an octet)
uCount >> 3 =    αβγδεζηθικ

So uCount + (uCount >> 3) is (λ+κ) * 20 + (κ+ι) * 23 + (ι+θ) * 26 + ...

By ANDing with 0o30707070707, we mask away every other octet, so that we only count each pair once:

r = (λ+κ) *  20 + (ι+θ) *  26 + (η+ζ) *  212 + ...
  = (λ+κ) * 640 + (ι+θ) * 641 + (η+ζ) * 642  + ...

This is a base-64 number, and we want to sum up the base-64 digits to get α+β+γ+δ+ε+ζ+η+θ+ι+κ+λ, our final result. To do this, we calculate its base-64 digital root: knowing that the result can never be bigger than 32, we just modulo the number by 63.




回答2:


Iterating the bits is constant time since the number of bits in a type is constant.

So a solution that checks a mask of one bit and shifts for each bit in the target value is indeed O(1) (e.g. where the constant is 32).




回答3:


The fastest way to do this is popcnt instruction. You can access it through compiler intrinsic usually. Your solution can be useful on platforms that lack this instruction.




回答4:


Counting bits set, in parallel shows how this is done. That method can be used for 8-, 16-, 32-, 64-, 128-, etc. bit words, although the constants used in the calculations change.

When we say that this operation is O(1), we mean that it can be done in constant time regardless of the word size. Counting the bits the naive way is O(n) in the number of bits.

Speaking practically, this is only O(1) when the processor can work with the word size natively.

As for how it works, it uses "magic numbers." See this newsgroup post for an explanation.



来源:https://stackoverflow.com/questions/19729466/how-to-find-number-of-1s-in-a-binary-number-in-o1-time

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!