I am working on a Python library that performs a lot of bitwise operations on long bit strings, and I want to find a bit string type that will maximize its speed. I have tried the built-in int, NumPy arrays, and bitarray, but I'm not sure which to pick.
As far as I can tell, the built-in Python 3 int is the only one of the options you tested that computes the & in chunks larger than one byte. (I haven't fully worked through the NumPy source for this operation, but it doesn't look like it has an optimization to combine elements into chunks bigger than the dtype.)
bitarray goes byte by byte. In contrast, the int operation works in digits of either 15 or 30 bits, depending on the compile-time parameter PYLONG_BITS_IN_DIGIT. I don't know which setting is the default.
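You can check which digit size your interpreter was built with at run time via sys.int_info (available since Python 3.1); on the 64-bit CPython builds I've seen, the answer is 30:

>>> import sys
>>> sys.int_info.bits_per_digit
30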
You can speed up the NumPy attempt by using a packed representation and a larger dtype. On my machine, a 32-bit dtype looks fastest, beating Python ints; I don't know what it's like on your setup. Testing with 10240-bit values in each format, I get
>>> import timeit
>>> timeit.timeit('a & b', 'import numpy; a = b = numpy.array([0]*160, dtype=numpy.uint64)')
1.3918750826524047
>>> timeit.timeit('a & b', 'import numpy; a = b = numpy.array([0]*160*8, dtype=numpy.uint8)')
1.9460716604953632
>>> timeit.timeit('a & b', 'import numpy; a = b = numpy.array([0]*160*2, dtype=numpy.uint32)')
1.1728465435917315
>>> timeit.timeit('a & b', 'a = b = 2**10240-1')
1.5999407862400403
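If you adopt the packed layout, you'll need to convert between it and whatever form your bit strings start in. Here's a minimal sketch of one way to do that for Python ints, built on int.to_bytes/int.from_bytes and numpy.frombuffer; the helper names int_to_packed and packed_to_int are mine, not a library API:

import numpy

def int_to_packed(value, nbits, dtype='<u4'):
    # '<u4' is explicitly little-endian uint32, matching byteorder='little' below.
    raw = value.to_bytes(nbits // 8, byteorder='little')
    # copy() because frombuffer returns a read-only view of the bytes object.
    return numpy.frombuffer(raw, dtype=dtype).copy()

def packed_to_int(arr):
    return int.from_bytes(arr.tobytes(), byteorder='little')

a = int_to_packed(2**10240 - 1, 10240)
b = int_to_packed(2**10239 + 12345, 10240)
assert packed_to_int(a & b) == (2**10240 - 1) & (2**10239 + 12345)

If your bit strings start life as bytes rather than ints, you can feed them to numpy.frombuffer directly and skip the to_bytes step.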