How to do an integer log2() in C++?

In the C++ standard libraries I found only a floating-point log function. I currently use log to find the level of an index in a binary tree (floor(log2(index))).

Code (C++):

int targetlevel = int(log(index)/log(2));

I am afraid that for some of the edge elements (the elements with value 2^n), log will return (n-1).999999999999 instead of n.0. Is this fear justified? How can I modify my statement so that it always returns the correct answer?

You can use this method instead:

int targetlevel = 0;
while (index >>= 1) ++targetlevel;

Note: this will modify index. If you need it unchanged, create another temporary int.

The corner case is when index is 0. You probably should check it separately and throw an exception or return an error if index == 0.
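As a rough sketch of the two notes above, the loop can be wrapped in a function that takes index by value (so the caller's variable is untouched) and rejects 0. The name tree_level and the choice of std::invalid_argument are assumptions made for illustration, not part of the original answer.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical helper: level of an index in a binary tree, i.e. floor(log2(index)).
// index is passed by value, so the shifting below only changes the local copy.
inline int tree_level(unsigned int index)
{
    if (index == 0)
        throw std::invalid_argument("tree_level: index must be nonzero");
    int level = 0;
    while (index >>= 1)   // drop one bit per iteration until nothing is left
        ++level;
    return level;
}

// Example use: the caller's variable keeps its value.
inline void tree_level_example()
{
    unsigned int index = 12;          // binary 1100
    assert(tree_level(index) == 3);   // highest set bit is bit 3
    assert(index == 12);              // unchanged, because it was passed by value
}
```

Returning an error code instead of throwing would work just as well; the point is only that 0 is handled before the loop runs.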

If you are on a recent-ish x86 or x86-64 platform (and you probably are), use the bsr instruction, which returns the position of the highest set bit in an unsigned integer. For nonzero input this is exactly floor(log2(x)); for an input of 0 the result is undefined, so check for that case first. Here is a short C or C++ function that invokes bsr using inline ASM:

#include <stdint.h>
static inline uint32_t log2(const uint32_t x) {
  uint32_t y;
  asm ( "\tbsr %1, %0\n"
      : "=r"(y)
      : "r" (x)
  );
  return y;
}

If you just want a fast integer log₂ operation, the following function mylog2() will do it without having to worry about floating-point accuracy:

#include <limits.h>

static unsigned int mylog2 (unsigned int val) {
    if (val == 0) return UINT_MAX;
    if (val == 1) return 0;
    unsigned int ret = 0;
    while (val > 1) {
        val >>= 1;
        ret++;
    }
    return ret;
}

#include <stdio.h>

int main (void) {
    for (unsigned int i = 0; i < 20; i++)
        printf ("%u -> %u\n", i, mylog2(i));
    putchar ('\n');
    for (unsigned int i = 0; i < 10; i++)
        printf ("%u -> %u\n", i+UINT_MAX-9, mylog2(i+UINT_MAX-9));
    return 0;
}

The code above also has a small test harness so you can check the behaviour:

0 -> 4294967295
1 -> 0
2 -> 1
3 -> 1
4 -> 2
5 -> 2
6 -> 2
7 -> 2
8 -> 3
9 -> 3
10 -> 3
11 -> 3
12 -> 3
13 -> 3
14 -> 3
15 -> 3
16 -> 4
17 -> 4
18 -> 4
19 -> 4

4294967286 -> 31
4294967287 -> 31
4294967288 -> 31
4294967289 -> 31
4294967290 -> 31
4294967291 -> 31
4294967292 -> 31
4294967293 -> 31
4294967294 -> 31
4294967295 -> 31

It will return UINT_MAX for an input value of 0 as an indication of an undefined result, so that's something you should check for (no valid unsigned integer will have a logarithm that high).

By the way, there are some insanely fast hacks to do exactly this (find the highest bit set in a 2's complement number) available from here. I wouldn't suggest using them unless speed is of the essence (I prefer readability myself) but you should be made aware that they exist.

Base-2 Integer Logarithm

Here is what I do for 64-bit unsigned integers. This calculates the floor of the base-2 logarithm, which is equivalent to the index of the most significant bit. This method is smokingly fast for large numbers because it uses an unrolled loop that executes always in log₂64 = 6 steps.

Essentially, it compares n against progressively smaller powers of two in the sequence { 0 ≤ k ≤ 5: 2^(2^k) } = { 2³², 2¹⁶, 2⁸, 2⁴, 2², 2¹ } = { 4294967296, 65536, 256, 16, 4, 2 }, divides n by each one it meets or exceeds (a right shift), and sums the exponents 2^k of the values divided out.

int uint64_log2(uint64_t n)
{
  #define S(k) if (n >= (UINT64_C(1) << k)) { i += k; n >>= k; }

  int i = -(n == 0); S(32); S(16); S(8); S(4); S(2); S(1); return i;

  #undef S
}

Note that this returns –1 if given the invalid input of 0 (which is what the initial -(n == 0) is checking for). If you never expect to invoke it with n == 0, you could substitute int i = 0; for the initializer and add assert(n != 0); at entry to the function.
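The assert variant described above might look like the following sketch. The name uint64_log2_nonzero is made up for illustration, and the macro is replaced here by an equivalent loop over the same shift amounts purely for readability; it is not the original author's code.

```cpp
#include <cassert>
#include <cstdint>

// Variant for callers that never pass 0: the precondition is asserted
// instead of being folded into the return value.
inline int uint64_log2_nonzero(std::uint64_t n)
{
    assert(n != 0);
    int i = 0;
    // k takes the values 32, 16, 8, 4, 2, 1. If n has a set bit at or above
    // position k, record k and shift those bits down.
    for (int k = 32; k > 0; k /= 2) {
        if (n >= (UINT64_C(1) << k)) { i += k; n >>= k; }
    }
    return i;
}

// Example use.
inline void uint64_log2_nonzero_example()
{
    assert(uint64_log2_nonzero(1) == 0);
    assert(uint64_log2_nonzero(UINT64_C(1) << 40) == 40);
    assert(uint64_log2_nonzero(UINT64_MAX) == 63);
}
```

Note that assert compiles away when NDEBUG is defined, so in release builds an input of 0 would silently return 0.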

Base-10 Integer Logarithm

Base-10 integer logarithms can be calculated similarly, with the largest square to test being 10¹⁶ because log₁₀2⁶⁴ ≅ 19.2659...

int uint64_log10(uint64_t n)
{
  #define S(k, m) if (n >= UINT64_C(m)) { i += k; n /= UINT64_C(m); }

  int i = -(n == 0);
  S(16,10000000000000000); S(8,100000000); S(4,10000); S(2,100); S(1,10);
  return i;

  #undef S
}

This has been proposed in the comments above. Using gcc builtins:

static inline int log2i(int x) {
    assert(x > 0);

    return sizeof(int) * 8 - __builtin_clz(x) - 1;
}

static void test_log2i(void) {
    assert_se(log2i(1) == 0);
    assert_se(log2i(2) == 1);
    assert_se(log2i(3) == 1);
    assert_se(log2i(4) == 2);
    assert_se(log2i(32) == 5);
    assert_se(log2i(33) == 5);
    assert_se(log2i(63) == 5);
    assert_se(log2i(INT_MAX) == sizeof(int)*8-2);
}

I've never had any problem with floating-point accuracy on the formula you're using (and a quick check of numbers from 1 to 2³¹ - 1 found no errors), but if you're worried, you can use this function instead, which returns the same results and is about 66% faster in my tests:

int HighestBit(int i){
    if(i == 0)
        return -1;

    int bit = 31;
    if((i & 0xFFFFFF00) == 0){
        i <<= 24;
        bit = 7;
    }else if((i & 0xFFFF0000) == 0){
        i <<= 16;
        bit = 15;
    }else if((i & 0xFF000000) == 0){
        i <<= 8;
        bit = 23;
    }

    if((i & 0xF0000000) == 0){
        i <<= 4;
        bit -= 4;
    }

    while((i & 0x80000000) == 0){
        i <<= 1;
        bit--;
    }

    return bit; 
}

int targetIndex = floor(log(i + 0.5)/log(2.0));

This isn't standard or necessarily portable, but it will in general work. I don't know how efficient it is.

Convert the integer index into a floating-point number of sufficient precision. The representation will be exact, assuming the precision is sufficient.

Look up the representation of IEEE floating-point numbers, extract the exponent, and make the necessary adjustment to find the base 2 log.
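The two steps above could be sketched roughly as follows. This assumes double is an IEEE 754 binary64 value (checked with a static_assert) and that the index fits in 32 bits so the conversion is exact; the function name log2_via_exponent is invented for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

// Sketch: floor(log2(index)) read from the exponent field of a double.
// Assumes IEEE 754 binary64 doubles; index must be nonzero.
inline int log2_via_exponent(std::uint32_t index)
{
    static_assert(std::numeric_limits<double>::is_iec559 && sizeof(double) == 8,
                  "requires IEEE 754 binary64 doubles");
    assert(index != 0);
    const double d = index;                 // exact: a double has 53 significant bits
    std::uint64_t bits;
    std::memcpy(&bits, &d, sizeof bits);    // well-defined way to read the representation
    const int biased = static_cast<int>((bits >> 52) & 0x7FF);  // 11-bit exponent field
    return biased - 1023;                   // remove the exponent bias
}

// Example use.
inline void log2_via_exponent_example()
{
    assert(log2_via_exponent(1) == 0);
    assert(log2_via_exponent(5) == 2);
    assert(log2_via_exponent(0x80000000u) == 31);
}
```

If reading the bit pattern directly feels too fragile, std::ilogb from <cmath> returns the same unbiased exponent through a standard interface.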

If you're using C++11 you can make this a constexpr function:

constexpr std::uint32_t log2(std::uint32_t n)
{
    return (n > 1) ? 1 + log2(n >> 1) : 0;
}

There are similar answers above. This answer works with 64-bit numbers, lets you choose the type of rounding, and includes test/sample code.

Functions:

    static int floorLog2(int64_t x)
    { 
      assert(x > 0);
      // __builtin_clzll takes unsigned long long, which is at least 64 bits
      // everywhere; __builtin_clzl only matches int64_t where long is 64 bits.
      return 63 - __builtin_clzll(x);
    }

    static int ceilLog2(int64_t x)
    {
      if (x == 1)
        // Handled separately because floorLog2(x-1) would count the leading
        // zeros of 0. On my system that returns 63 (64 would make more sense),
        // but the result is undefined, so just avoid calling the builtin with 0.
        return 0;
      else
        return floorLog2(x-1) + 1;
    }

Test Code:

for (int i = 1; i < 35; i++)
  std::cout<<"floorLog2("<<i<<") = "<<floorLog2(i)
           <<", ceilLog2("<<i<<") = "<<ceilLog2(i)<<std::endl;

This function determines how many bits are required to represent the numeric interval: [0..maxvalue].

unsigned binary_depth( unsigned maxvalue )
   {
   int depth=0;
   while ( maxvalue ) maxvalue>>=1, depth++;
   return depth;
   }

By subtracting 1 from the result, you get floor(log2(x)), which is an exact representation of log2(x) when x is a power of 2.

x  y  y-1
0  0  -1
1  1   0
2  2   1
3  2   1
4  3   2
5  3   2
6  3   2
7  3   2
8  4   3

How deep do you project your tree to be? You could set a range of say... +/- 0.00000001 to the number to force it to an integer value.

I'm actually not certain you'll hit a number like 1.99999999, because your log2 should not lose any accuracy when calculating 2^n values (powers of 2 are represented exactly in binary floating point).
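To make the tolerance idea concrete, here is a minimal sketch. The depth question matters because the nudge has to stay smaller than the gap between log2(2^n - 1) and n, which is roughly 1.44 / 2^n and so shrinks as the tree gets deeper. The bound of 2^20 and the name level_with_tolerance are assumptions chosen for the example.

```cpp
#include <cassert>
#include <cmath>

// Sketch of the tolerance approach under an assumed depth limit.
// For index < 2^20 the gap below each power of two is at least about 1.4e-6,
// so adding 1e-8 cannot push 2^n - 1 up to level n, while it still absorbs
// rounding error of the order 1e-15 in log(index)/log(2).
inline int level_with_tolerance(unsigned int index)
{
    assert(index >= 1 && index < (1u << 20));
    const double epsilon = 1e-8;
    const double exact = std::log(static_cast<double>(index)) / std::log(2.0);
    return static_cast<int>(exact + epsilon);   // truncation equals floor for values >= 0
}

// Example use on the edge elements the question worries about.
inline void level_with_tolerance_example()
{
    assert(level_with_tolerance(8) == 3);
    assert(level_with_tolerance(7) == 2);
    assert(level_with_tolerance((1u << 19) - 1) == 18);
}
```

For deeper trees the same epsilon eventually becomes too large (around 2^28 the gap drops below 1e-8), which is one reason the integer-only answers are the safer choice.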

bobobobo

This function I wrote here

// The 'i' is for int; there is a log2 for double in the standard library
inline unsigned int log2i( unsigned int x )
{
  unsigned int log2Val = 0 ;
  // Shift bits off to the right until x is 0, counting the shifts
  // 101 => 10 => 1 => 0
  // which means the high bit was the 3rd bit, whose value is 2^2
  while( x>>=1 ) log2Val++;  // div by 2 until find log2.  log_2(63)=5.97, so
  // take that as 5, (this is a traditional integer function!)
  // eg x=63 (111111), log2Val=5 (last one isn't counted by the while loop)
  return log2Val ;
}

Andrea993

This is an old post, but here is my one-line algorithm:

unsigned uintlog2(unsigned x)
{
   unsigned l;
   for(l=0; x>1; x>>=1, l++);
   return l;
}

Rewriting Todd Lehman's answer to be more generic:

#include <climits>

template<typename N>
constexpr N ilog2(N n) {
    N i = 0;
    for (N k = sizeof(N) * CHAR_BIT; 0 < (k /= 2);) {
        if (n >= static_cast<N>(1) << k) { i += k; n >>= k; }
    }
    return i;
}

Clang with -O3 unrolls the loop:

0000000100000f50    pushq   %rbp
0000000100000f51    movq    %rsp, %rbp
0000000100000f54    xorl    %eax, %eax
0000000100000f56    cmpl    $0xffff, %edi
0000000100000f5c    setg    %al
0000000100000f5f    shll    $0x4, %eax
0000000100000f62    movl    %eax, %ecx
0000000100000f64    sarl    %cl, %edi
0000000100000f66    xorl    %edx, %edx
0000000100000f68    cmpl    $0xff, %edi
0000000100000f6e    setg    %dl
0000000100000f71    leal    (,%rdx,8), %ecx
0000000100000f78    sarl    %cl, %edi
0000000100000f7a    leal    (%rax,%rdx,8), %eax
0000000100000f7d    xorl    %edx, %edx
0000000100000f7f    cmpl    $0xf, %edi
0000000100000f82    setg    %dl
0000000100000f85    leal    (,%rdx,4), %ecx
0000000100000f8c    sarl    %cl, %edi
0000000100000f8e    leal    (%rax,%rdx,4), %eax
0000000100000f91    xorl    %edx, %edx
0000000100000f93    cmpl    $0x3, %edi
0000000100000f96    setg    %dl
0000000100000f99    leal    (%rdx,%rdx), %ecx
0000000100000f9c    sarl    %cl, %edi
0000000100000f9e    leal    (%rax,%rdx,2), %ecx
0000000100000fa1    xorl    %eax, %eax
0000000100000fa3    cmpl    $0x1, %edi
0000000100000fa6    setg    %al
0000000100000fa9    orl %ecx, %eax
0000000100000fab    popq    %rbp

When n is a constant, the result is computed at compile time.
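One way to see the compile-time evaluation is to put the calls inside static_assert, which only accepts constant expressions. The sketch below repeats the template so that it stands alone; it assumes C++14 or later, since a loop inside a constexpr function is not allowed in C++11.

```cpp
#include <climits>

// Same template as above, repeated so this example compiles on its own.
template<typename N>
constexpr N ilog2(N n) {
    N i = 0;
    for (N k = sizeof(N) * CHAR_BIT; 0 < (k /= 2);) {
        if (n >= static_cast<N>(1) << k) { i += k; n >>= k; }
    }
    return i;
}

// These checks are evaluated by the compiler; no code runs at program start.
static_assert(ilog2(1u) == 0, "ilog2(1) should be 0");
static_assert(ilog2(1024u) == 10, "ilog2(1024) should be 10");
static_assert(ilog2(0xFFFFFFFFu) == 31, "highest bit of a 32-bit value is 31");
static_assert(ilog2(1ull << 40) == 40, "works for 64-bit types too");
```

If any of these conditions were false, the build would fail rather than the program misbehaving at run time.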

Given the way floating point numbers work (crudely, mantissa * 2^exponent), then any number up to 2^127 that is a power of 2 will be exactly represented without error.

This does give a trivial but rather hacky solution - interpret the bit pattern of the floating point number as an integer, and just look at the exponent. This is David Thornley's solution above.

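float f = 1;
for (int i = 0; i < 128; i++)
{
    // Read the bit pattern with memcpy (from <cstring>); casting &f to int*
    // would break the strict-aliasing rules.
    unsigned int bits;
    memcpy(&bits, &f, sizeof bits);
    int x = int(bits >> 23) - 127;   // exponent field minus the bias (sign bit is 0 here)
    int l = int(log(f) / log(2));

    printf("i = %d, log = %d, f = %f quick = %d\n",
        i, l, f, x);
    f *= 2;
}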

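It is not true that any integer can be represented as a float: only those whose significant bits fit in the mantissa can be. For 32-bit floats that is 24 bits' worth (23 stored bits plus the implicit leading 1), so every integer up to 2^24 is exact and larger ones may be rounded.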
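A small sketch can show where a 32-bit float stops representing integers exactly. It assumes float is an IEEE 754 binary32 value, and the helper name survives_float_round_trip is invented for the example.

```cpp
#include <cassert>
#include <cstdint>

// Returns true if converting n to float keeps its value exactly.
// The comparison is done in double, which holds every 32-bit integer exactly.
inline bool survives_float_round_trip(std::uint32_t n)
{
    const float f = static_cast<float>(n);
    return static_cast<double>(f) == static_cast<double>(n);
}

// Example use around the 2^24 boundary.
inline void float_round_trip_example()
{
    assert(survives_float_round_trip(16777216u));    // 2^24: exact
    assert(!survives_float_round_trip(16777217u));   // 2^24 + 1: needs 25 bits, gets rounded
    assert(survives_float_round_trip(16777218u));    // even, so the low bit is not needed
}
```

Powers of two stay exact far beyond this limit because they need only one significant bit, which is why the exponent trick above still works for them.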

Source: https://stackoverflow.com/questions/994593/how-to-do-an-integer-log2-in-c
