Finding the exponent of n = 2**x using bitwise operations [logarithm in base 2 of n]

猫巷女王i 2020-12-31 05:47

Is there a straightforward way to extract the exponent from a power of 2 using bitwise operations only?

EDIT: Although the question was originall

7 Answers
  • 2020-12-31 06:12

    You can do it in O(lg s) steps for arbitrary-length integers (with s the size of the integer in bits) using a binary search over the shift amount.

    import sys
    def floorlg(n):
        if n < 1:
            return -1
        low=0
        high=sys.getsizeof(n)*8 # not the best upper-bound guesstimate, but...
        while True:
            mid = (low+high)//2
            i = n >> mid
            if i == 1:
                return mid
            if i == 0:
                high = mid-1
            else:
                low = mid+1
    

    For fixed size integers, a lookup table should be the fastest solution, and probably best overall.
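One possible shape for such a table, as a rough sketch rather than anything taken from this answer: a 256-entry table of per-byte logarithms, plus a few shifts to find the highest non-zero byte of a 32-bit value. The names BYTE_LOG2 and floorlg32 are made up for illustration.

```python
# BYTE_LOG2[i] == floor(log2(i)) for 1 <= i <= 255; index 0 is a placeholder.
BYTE_LOG2 = [0] * 256
for i in range(2, 256):
    BYTE_LOG2[i] = BYTE_LOG2[i >> 1] + 1

def floorlg32(n):
    """floor(log2(n)) for 1 <= n < 2**32 (hypothetical helper)."""
    if n >> 16:
        if n >> 24:
            return 24 + BYTE_LOG2[n >> 24]
        return 16 + BYTE_LOG2[n >> 16]
    if n >> 8:
        return 8 + BYTE_LOG2[n >> 8]
    return BYTE_LOG2[n]
```

Whether this actually beats the binary search in Python would need measuring; the sketch only shows the structure.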

  • 2020-12-31 06:18

    Late to the party, but how about int.bit_length(n) - 1? You asked for straightforward, and that seems the simplest to me. The CPython implementation looks reasonably performant.

    • Reference documentation
    • CPython source code
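A short sketch of how this might be used, with an added check that the input really is a power of two. The function name exponent_of_power_of_two is an assumption for illustration, not part of any library.

```python
def exponent_of_power_of_two(n):
    """Return x such that n == 2**x; raise ValueError for anything else."""
    # A power of two has exactly one bit set, so n & (n - 1) clears it to 0.
    if n <= 0 or n & (n - 1):
        raise ValueError("not a power of two: %r" % (n,))
    return n.bit_length() - 1

print(exponent_of_power_of_two(1 << 20))  # 20
print((1000).bit_length() - 1)            # 9, i.e. floor(log2(1000))
```

Without the check, n.bit_length() - 1 simply gives the floor of the base-2 logarithm for any positive integer.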
  • 2020-12-31 06:22

    It seems like the range is known. Let's assume it goes up to 1<<20, just to make it more interesting:

    max_log2=20
    

    So make a list that (in effect) maps an integer to its base 2 logarithm. The following will do the trick:

    log2s_table=[0]*((1<<max_log2)+1)
    for i in range(max_log2+1):
        log2s_table[1<<i]=i
    

    (This doesn't do anything useful for numbers that aren't powers of two; the problem statement suggests they don't need to be handled. Would be easy enough to fix that though.)
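One way that fix might look, as a sketch only: fill every slot with the floor of the base-2 logarithm by reusing the entry for i >> 1. The name floor_log2s_table is hypothetical, and max_log2 is repeated here so the snippet runs on its own.

```python
max_log2 = 20  # same bound as in the answer

# floor_log2s_table[i] == floor(log2(i)) for 1 <= i <= 2**max_log2.
# Index 0 is a placeholder, since log2(0) is undefined.
floor_log2s_table = [0] * ((1 << max_log2) + 1)
for i in range(2, len(floor_log2s_table)):
    # Dropping the lowest bit lowers the logarithm by exactly one.
    floor_log2s_table[i] = floor_log2s_table[i >> 1] + 1
```

The lookup itself stays the same; only the table construction changes.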

    The function to get the logarithm is very simple, and could easily be inlined:

    def table(v):
        return log2s_table[v]
    

    I can't guarantee that the test code I wrote is exactly the same as the one being used to get the example timings, but this is rather quicker than the stringcount code:

    stringcount: 0.43 s.
    table: 0.16 s.
    

    Since all the values in the table are less than 256, I wondered whether using a string instead of a list would be quicker, or maybe an array.array of bytes, but no dice:

    string: 0.25 s.
    arr: 0.21 s.
    

    Using a dict to do the lookup is another possibility, taking advantage of the way only powers of two are being checked:

    log2s_map=dict([(1<<x,x) for x in range(max_log2+1)])
    
    def map(v):
        return log2s_map[v]
    

    The results for this weren't as good, though:

    map: 0.20 s.
    

    And just for fun one could also use the hex method on float objects to get a string that includes (as its last part) the base 2 exponent of the number. This is a bit slow to extract in general, but if the exponent is only ever going to be one digit it could be done straightforwardly enough:

    def floathex(v):
        return ord(float(v).hex()[-1])-48
    

    This is purely for entertainment value though, as it was uncompetitive -- though, amazingly, still quicker than the bitwise approach.
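For exponents of more than one digit, the same float-based idea could be sketched as below. The names floathex_any and frexp_exponent are made up for illustration, and both variants only work for values that fit in a float (powers of two up to 2**1023).

```python
import math

def floathex_any(v):
    """Exponent read from float.hex(), e.g. '0x1.0000000000000p+20' -> 20."""
    # Everything after the 'p' is the signed decimal exponent.
    return int(float(v).hex().split('p')[1])

def frexp_exponent(v):
    """Same result via math.frexp, which returns (m, e) with v == m * 2**e, 0.5 <= m < 1."""
    return math.frexp(v)[1] - 1
```

For integers that are not exactly representable as a float, rounding can shift the result, so this is only a safe shortcut for exact powers of two.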

    So it looks like using a list is the way to go.

    (This approach won't scale indefinitely, due to limited memory, but to make up for that the execution speed won't depend on max_log2, or the input values, in any way that you'll notice when running Python code. Regarding the memory consumption, if I remember my Python internals correctly, the table will take up about (1<<max_log2)*4 bytes with 4-byte pointers (twice that on a 64-bit build), because the contents are all small integers that the interpreter will intern automatically. So, when max_log2 is 20, that's about 4MB.)
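The memory estimate can be checked rather than remembered. A minimal sketch, assuming CPython (sys.getsizeof is implementation-specific) and rebuilding a table of the same length:

```python
import struct
import sys

max_log2 = 20
log2s_table = [0] * ((1 << max_log2) + 1)

# Size of one object pointer: 4 on 32-bit builds, 8 on 64-bit builds.
pointer_size = struct.calcsize("P")
# Size of the list object itself; the small ints it refers to are shared.
table_bytes = sys.getsizeof(log2s_table)

print(pointer_size, table_bytes, table_bytes / float(1 << max_log2))
```

The last printed number is the approximate cost per table entry in bytes.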

  • 2020-12-31 06:24

    Short answer

    As far as python is concerned:

    • The fastest method of all to find the exponent of 2**x is by looking up in a dictionary whose keys are the powers of 2 (see "hashlookup" in the code)
    • The fastest bitwise method is the one called "unrolled_bitwise".
    • Both previous methods have well-defined (but extensible) upper limits. The fastest method without hard-coded upper limits (which scales up as far as python can handle numbers) is "log_e".

    Preliminary notes

    1. All speed measurements below were obtained via timeit.Timer.repeat(testn, cycles), where testn was set to 3 and cycles was automatically adjusted by the script to obtain times in the range of seconds (note: there was a bug in this auto-adjusting mechanism that was fixed on 18/02/2010).
    2. Not all methods scale, which is why I did not test every function for every power of 2.
    3. I did not manage to get some of the proposed methods to work (the function returns a wrong result). I have not yet had time for a step-by-step debugging session, so I included the code (commented out) in case somebody spots the mistake by inspection (or wants to debug it themselves).
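The measurement procedure in note 1 can be approximated as follows. This is only a sketch: the original script is not shown, so the helper name time_func, the min_seconds threshold, and the growth factor of 10 are all assumptions.

```python
import timeit

def stringcount(v):
    return len(bin(v)) - 3

def time_func(func, arg, testn=3, min_seconds=0.05):
    """Return (cycles, best_time) for calling func(arg) repeatedly."""
    timer = timeit.Timer(lambda: func(arg))
    cycles = 1000
    # Grow the loop count until a single run is long enough to measure.
    while timer.timeit(cycles) < min_seconds:
        cycles *= 10
    # Timer.repeat(repeat, number): testn runs of `cycles` calls each.
    return cycles, min(timer.repeat(testn, cycles))
```

Calling time_func(stringcount, 2**5) returns the loop count that was settled on and the best of the three timings; raising min_seconds pushes the times towards the range of seconds used in the results.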

    Results

    func(2**5)

    hashlookup:          0.13s     100%
    lookup:              0.15s     109%
    stringcount:         0.29s     220%
    unrolled_bitwise:    0.36s     272%
    log_e:               0.60s     450%
    bitcounter:          0.64s     479%
    log_2:               0.69s     515%
    ilog:                0.81s     609%
    bitwise:             1.10s     821%
    olgn:                1.42s    1065%
    

    func(2**31)

    hashlookup:          0.11s     100%
    unrolled_bitwise:    0.26s     229%
    log_e:               0.30s     268%
    stringcount:         0.30s     270%
    log_2:               0.34s     301%
    ilog:                0.41s     363%
    bitwise:             0.87s     778%
    olgn:                1.02s     912%
    bitcounter:          1.42s    1264%
    

    func(2**128)

    hashlookup:     0.01s     100%
    stringcount:    0.03s     264%
    log_e:          0.04s     315%
    log_2:          0.04s     383%
    olgn:           0.18s    1585%
    bitcounter:     1.41s   12393%
    

    func(2**1024)

    log_e:          0.00s     100%
    log_2:          0.01s     118%
    stringcount:    0.02s     354%
    olgn:           0.03s     707%
    bitcounter:     1.73s   37695%
    

    Code

    import math, sys
    
    def stringcount(v):
        """mac"""    
        return len(bin(v)) - 3
    
    def log_2(v):
        """mac"""    
        return int(round(math.log(v, 2), 0)) # 2**101 generates 100.999999999
    
    def log_e(v):
        """bp on mac"""    
        return int(round(math.log(v)/0.69314718055994529, 0))  # 0.69 == log(2)
    
    def bitcounter(v):
        """John Y on mac"""
        r = 0
        while v > 1 :
            v >>= 1
            r += 1
        return r
    
    def olgn(n) :
        """outis"""
        if n < 1:
            return -1
        low = 0
        high = sys.getsizeof(n)*8 # not the best upper-bound guesstimate, but...
        while True:
            mid = (low+high)//2
            i = n >> mid
            if i == 1:
                return mid
            if i == 0:
                high = mid-1
            else:
                low = mid+1
    
    def hashlookup(v):
        """mac on brone -- limit: v < 2**131"""
    #    def prepareTable(max_log2=130) :
    #        hash_table = {}
    #        for p in range(1, max_log2) :
    #            hash_table[2**p] = p
    #        return hash_table
    
        global hash_table
        return hash_table[v] 
    
    def lookup(v):
        """brone -- limit: v < 2**11"""
    #    def prepareTable(max_log2=10) :
    #        log2s_table=[0]*((1<<max_log2)+1)
    #        for i in range(max_log2+1):
    #            log2s_table[1<<i]=i
    #        return tuple(log2s_table)
    
        global log2s_table
        return log2s_table[v]
    
    def bitwise(v):
        """Mark Byers -- limit: v < 2**32"""
        b = (0x2, 0xC, 0xF0, 0xFF00, 0xFFFF0000)
        S = (1, 2, 4, 8, 16)
        r = 0
        for i in range(4, -1, -1):
            if v & b[i]:
                v >>= S[i]
                r |= S[i]
        return r
    
    def unrolled_bitwise(v):
        """x4u on Mark Byers -- limit:   v < 2**33"""
        r = 0
        if v > 0xffff:
            v >>= 16
            r = 16
        if v > 0x00ff:
            v >>= 8
            r += 8
        if v > 0x000f:
            v >>= 4
            r += 4
        if v > 0x0003:
            v >>= 2
            r += 2
        return r + (v >> 1)
    
    def ilog(v):
        """Gregory Maxwell - (Original code: B. Terriberry) -- limit: v < 2**32"""
        ret = 1
        m = (not not v & 0xFFFF0000) << 4
        v >>= m
        ret |= m
        m = (not not v & 0xFF00) << 3
        v >>= m
        ret |= m
        m = (not not v & 0xF0) << 2
        v >>= m
        ret |= m
        m = (not not v & 0xC) << 1
        v >>= m
        ret |= m
        ret += (not not v & 0x2)
        return ret - 1
    
    
    # following table is equal to "return hashlookup.prepareTable()" 
    hash_table = {...} # numbers have been cut out to avoid cluttering the post
    
    # following table is equal to "return lookup.prepareTable()" - cached for speed
    log2s_table = (...) # numbers have been cut out to avoid cluttering the post
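
Since the table contents were cut from the post, here is a sketch of how they might be rebuilt from the commented-out prepareTable helpers. The function names prepare_hash_table and prepare_log2s_table are hypothetical; the loop bounds are copied from the comments as written.

```python
def prepare_hash_table(max_log2=130):
    # Mirrors the commented-out helper: keys 2**1 .. 2**(max_log2 - 1).
    # Note that 2**0 is not included with these bounds.
    return {2 ** p: p for p in range(1, max_log2)}

def prepare_log2s_table(max_log2=10):
    # Only the power-of-two slots are meaningful; the rest stay 0.
    table = [0] * ((1 << max_log2) + 1)
    for i in range(max_log2 + 1):
        table[1 << i] = i
    return tuple(table)

hash_table = prepare_hash_table()
log2s_table = prepare_log2s_table()
```

Building the tables once at import time keeps that cost out of the timed calls, which matches the "cached for speed" remark.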
    
  • 2020-12-31 06:28

    "language agnostic" and worrying about performance are pretty much incompatible concepts.

    Most modern processors have a CLZ instruction, "count leading zeros". In GCC you can get to it with __builtin_clz(x) (which also produces reasonable, if not the fastest, code for targets that lack clz). Note that this CLZ is undefined for zero, so you'll need an extra branch to catch that case if it matters in your application.
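The relation between CLZ and the logarithm can be illustrated in Python, purely as an emulation: for a 32-bit value, floor(log2(x)) equals 31 minus the number of leading zeros. The names clz32 and log2_via_clz are made up for this sketch, and raising on zero stands in for the undefined case mentioned above.

```python
def clz32(x):
    """Count leading zeros of a 32-bit value (emulation for illustration only)."""
    if not 0 < x < (1 << 32):
        raise ValueError("expected 0 < x < 2**32")
    return 32 - x.bit_length()

def log2_via_clz(x):
    """floor(log2(x)) derived from the leading-zero count."""
    return 31 - clz32(x)
```

This says nothing about speed in Python; it only shows the arithmetic a C implementation would do with the real instruction.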

    In CELT ( http://celt-codec.org ) the branchless CLZ we use for compilers lacking a CLZ was written by Timothy B. Terriberry:

    
    int ilog(uint32 _v){
      int ret;
      int m;
      ret=!!_v;
      m=!!(_v&0xFFFF0000)<<4;
      _v>>=m;
      ret|=m;
      m=!!(_v&0xFF00)<<3;
      _v>>=m;
      ret|=m;
      m=!!(_v&0xF0)<<2;
      _v>>=m;
      ret|=m;
      m=!!(_v&0xC)<<1;
      _v>>=m;
      ret|=m;
      ret+=!!(_v&0x2);
      return ret;
    }
    

    (The comments indicate that this was found to be faster than a branching version and a lookup table based version)

    But if performance is that critical you probably shouldn't be implementing this part of your code in python.

  • 2020-12-31 06:29

    This is actually a comment on the performance test posted by mac. I am posting it as an answer to get proper code formatting and indentation.

    mac, could you try an unrolled implementation of the bit search suggested by Mark Byers? Maybe it's just the array access that slows it down. In theory this approach should be faster than the others.

    It would look something like this, although I'm not sure whether the formatting is right for python but I guess you can see what it is supposed to do.

    def bitwise(v):
        r = 0
        if v > 0xffff: v >>= 16; r = 16
        if v > 0x00ff: v >>= 8; r += 8
        if v > 0x000f: v >>= 4; r += 4
        if v > 0x0003: v >>= 2; r += 2
        return r + (v >> 1)
    

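    If Python shares Java's lack of unsigned integers, it would need to be something like this (Java would use the unsigned shift >>>= in the first step; Python has no such operator, and >>= gives the same result for non-negative v):

    def bitwise(v):
        r = 0
        # Test the high half with a mask instead of a signed comparison.
        if v & 0xffff0000: v >>= 16; r = 16
        if v > 0x00ff: v >>= 8; r += 8
        if v > 0x000f: v >>= 4; r += 4
        if v > 0x0003: v >>= 2; r += 2
        return r + (v >> 1)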
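
A quick way to sanity-check an unrolled version like this is to compare it with int.bit_length for every power of two in its range. The sketch below restates the function in multi-line form; the mismatches list is just an illustrative name.

```python
def unrolled_bitwise(v):
    r = 0
    if v > 0xffff:
        v >>= 16
        r = 16
    if v > 0x00ff:
        v >>= 8
        r += 8
    if v > 0x000f:
        v >>= 4
        r += 4
    if v > 0x0003:
        v >>= 2
        r += 2
    return r + (v >> 1)

# Exponents whose result disagrees with the built-in bit_length.
mismatches = [x for x in range(33)
              if unrolled_bitwise(1 << x) != (1 << x).bit_length() - 1]
print(mismatches)
```

An empty list means the function agrees with bit_length for 2**0 through 2**32.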
    