I have hundreds of thousands of NumPy boolean arrays that I would like to use as keys to a dictionary. (The values of this dictionary are the number of times we've observed
I would convert the array to a bitfield using np.packbits. This is fairly memory-efficient because it uses all eight bits of each byte, and the code stays relatively simple.
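import numpy as np
array = np.array([True, False] * 20)
# Pack 8 booleans per byte; the resulting bytes object is hashable.
Hash = np.packbits(array).tobytes()
counts = {}
counts[Hash] = 10
# Unpack again and drop the zero bits that packbits added as padding.
print(np.unpackbits(np.frombuffer(Hash, np.uint8)).astype(bool)[:len(array)])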
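Be careful with variable-length boolean arrays: np.packbits pads the last byte with zero bits, so the code does not distinguish between, for example, an all-False array of 6 elements and one of 7. For multidimensional arrays you will also need to keep the shape and reshape after unpacking.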
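One way around both caveats is to make the shape part of the key. The following is only a sketch under that assumption; make_key and restore_array are hypothetical helper names, not NumPy functions, and the Counter stands in for the dictionary of observation counts.

```python
from collections import Counter

import numpy as np


def make_key(arr):
    """Return a hashable key (shape, packed bytes) for a boolean array."""
    arr = np.asarray(arr, dtype=bool)
    # packbits flattens the array when no axis is given and pads the
    # last byte with zero bits, so the shape is stored alongside it.
    return arr.shape, np.packbits(arr).tobytes()


def restore_array(key):
    """Rebuild the boolean array from a key produced by make_key."""
    shape, packed = key
    n = int(np.prod(shape))  # number of real (non-padding) bits
    bits = np.unpackbits(np.frombuffer(packed, dtype=np.uint8))[:n]
    return bits.astype(bool).reshape(shape)


counts = Counter()
six = np.zeros(6, dtype=bool)
seven = np.zeros(7, dtype=bool)
grid = np.array([[True, False, True], [False, False, True]])

for observed in (six, seven, seven, grid):
    counts[make_key(observed)] += 1
```

The shape tuple costs a few extra bytes per key, so if all arrays are known to have the same shape, the plain packed bytes are enough.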
If this is still not efficient enough and your arrays are large, you might be able to reduce memory further by compressing the packed bytes:
import bz2
# The second argument is the compression level (1 = fastest, 9 = strongest).
Hash_compressed = bz2.compress(Hash, 1)
This does not help for random, incompressible data, though.
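Whether compression pays off is easy to measure on your own data. The sketch below compares a very regular array with a random one; the array sizes, the seed, and the helper name packed_and_compressed_size are arbitrary choices for illustration.

```python
import bz2

import numpy as np


def packed_and_compressed_size(arr):
    """Return (bytes after packbits, bytes after packbits + bz2)."""
    packed = np.packbits(arr).tobytes()
    return len(packed), len(bz2.compress(packed, 1))


rng = np.random.default_rng(0)  # fixed seed, only for repeatability

# Long runs of identical bits: a good case for compression.
regular = np.zeros(100_000, dtype=bool)
# Independent random bits: essentially nothing left to squeeze out.
noisy = rng.integers(0, 2, size=100_000).astype(bool)

for name, arr in (("regular", regular), ("noisy", noisy)):
    packed_size, compressed_size = packed_and_compressed_size(arr)
    print(name, packed_size, "->", compressed_size, "bytes")
```

If the compressed size is not clearly smaller for a sample of your real arrays, the packed bytes alone are probably the better key.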