Python bit array (performant)

天涯浪人 2020-12-31 13:49

I'm designing a bloom filter and I'm wondering what the most performant bit array implementation is in Python.

The nice thing about Python is that it can handle ar

3 answers
  •  我在风中等你
    2020-12-31 14:19

    Disclaimer: I am the main developer of intbitset :-), which was mentioned above in one of the comments. I just wanted to let you know that, as of a few weeks ago, intbitset is compatible with Python 3.3 and 3.4. In addition, it appears to be almost twice as fast as the native int operations:

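    import random
    from intbitset import intbitset

    # Two random samples of 10,000 bit positions out of 1,000,000.
    x = random.sample(range(1000000), 10000)
    y = random.sample(range(1000000), 10000)

    # Native int bit arrays: set bit i for every i in the sample.
    m = 0
    for i in x:
        m |= 1 << i
    n = 0
    for i in y:  # build n from y so that it matches ni below
        n |= 1 << i

    # The same two sets as intbitset objects.
    mi = intbitset(x)
    ni = intbitset(y)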
    
    %timeit m & n ## native int
    10000 loops, best of 3: 27.3 µs per loop
    
    %timeit mi & ni ## intbitset
    100000 loops, best of 3: 13.9 µs per loop
    
    %timeit m | n ## native int
    10000 loops, best of 3: 26.8 µs per loop
    
    %timeit mi | ni ## intbitset
    100000 loops, best of 3: 15.8 µs per loop
    
    ## note the above were just tested on Python 2.7, Ubuntu 14.04.
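
For readers without IPython or intbitset installed, the native int half of the comparison can be approximated with the standard library alone. This is only a rough sketch: the helper names `bits_from` and `indices_from` are made up for illustration, and the timing it prints will depend on the machine and Python version, so it is not expected to match the figures above.

```python
import random
import timeit


def bits_from(indices):
    """Pack non-negative ints into a single Python int used as a bit array."""
    value = 0
    for i in indices:
        value |= 1 << i
    return value


def indices_from(value):
    """Return the positions of the set bits in value, in ascending order."""
    result = []
    while value:
        low = value & -value                 # isolate the lowest set bit
        result.append(low.bit_length() - 1)  # its position
        value ^= low                         # clear it and continue
    return result


if __name__ == "__main__":
    x = random.sample(range(1_000_000), 10_000)
    y = random.sample(range(1_000_000), 10_000)
    m, n = bits_from(x), bits_from(y)

    # Sanity check: AND on the ints agrees with set intersection.
    assert indices_from(m & n) == sorted(set(x) & set(y))

    # Rough stand-in for %timeit: best of 3 runs of 1000 calls each.
    best = min(timeit.repeat(lambda: m & n, number=1000, repeat=3)) / 1000
    print(f"m & n: {best * 1e6:.1f} µs per call")
```

The sanity check is the useful part here: it confirms that `&` on the packed ints behaves like set intersection before any timing is trusted.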
    

    Additionally, intbitset supports some unique features such as infinite sets, which are useful, for example, when building a search engine where you have the concept of a universe (the union of an infinite set with a regular set returns an infinite set, and so on).

    For more information about how intbitset performance compares with Python sets, see: http://intbitset.readthedocs.org/en/latest/#performance
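
To tie this back to the bloom filter in the question, here is a minimal sketch of one possible layout, using a plain Python int as the bit array. Everything in it is an assumption made for illustration: the class name `BloomFilter`, the parameters `size_bits` and `num_hashes`, and the choice of SHA-256 with a one-byte seed prefix (picked for simplicity, not speed). It is not taken from intbitset or from any particular library.

```python
import hashlib


class BloomFilter:
    """Illustrative bloom filter backed by a Python int (hypothetical design)."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # the bit array: bit i is set when (bits >> i) & 1 == 1

    def _positions(self, item):
        """Yield num_hashes bit positions derived from the item."""
        data = item.encode("utf-8")
        for seed in range(self.num_hashes):
            # Prefix a seed byte so each round gives a different hash.
            digest = hashlib.sha256(bytes([seed]) + data).digest()
            yield int.from_bytes(digest[:8], "little") % self.size_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def __contains__(self, item):
        # True means "possibly present"; False means "definitely absent".
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


if __name__ == "__main__":
    bf = BloomFilter()
    bf.add("apple")
    print("apple" in bf)
```

Only `add` and `__contains__` touch `self.bits`, so the storage could in principle be swapped for another bit array implementation and benchmarked in the same way as above.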
