I have integers in the range 0..2**m - 1 and I would like to convert them to binary numpy arrays of length m. For example, say m = 4.
Here's a somewhat 'hacky' solution.
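import math
import numpy as np

def bin_array(num, m):
    """Returns an array representing the binary representation of num in m bits."""
    n_bytes = int(math.ceil(m / 8.0))
    # Note: '>u%d' is only a valid dtype when n_bytes is 1, 2, 4 or 8.
    # Unsigned avoids overflow when the top bit is set (e.g. num=255, m=8).
    num_arr = np.array([num], dtype='>u%d' % n_bytes)
    return np.unpackbits(num_arr.view(np.uint8))[-m:]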
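To see why taking the last m bits works, it can help to look at the intermediate values. The snippet below is only an illustrative walk-through; the values num = 300 and m = 10 are picked for this example and are not from the original post.

```python
import numpy as np

num, m = 300, 10                          # illustrative values only
num_arr = np.array([num], dtype='>u2')    # big-endian unsigned 16-bit integer
as_bytes = num_arr.view(np.uint8)         # raw bytes, most significant byte first
all_bits = np.unpackbits(as_bytes)        # 16 bits, most significant bit first
last_m = all_bits[-m:]                    # keep the m least significant bits
print(as_bytes, all_bits, last_m)
```

Because the bytes are stored big-endian, the padding zeros all sit at the front, so slicing from the end keeps exactly the m low-order bits.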
One-line version, taking advantage of the fast path in numpy.binary_repr:
def bin_array(num, m):
    """Convert a positive integer num into an m-bit bit vector"""
    return np.array(list(np.binary_repr(num).zfill(m))).astype(np.int8)
Example:
In [1]: bin_array(15, 6)
Out[1]: array([0, 0, 1, 1, 1, 1], dtype=int8)
Vectorized version for expanding an entire numpy array of ints at once:
def vec_bin_array(arr, m):
    """
    Arguments:
    arr: Numpy array of positive integers
    m: Number of bits of each integer to retain
    Returns a copy of arr with every element replaced with a bit vector.
    Bits encoded as int8's.
    """
    to_str_func = np.vectorize(lambda x: np.binary_repr(x).zfill(m))
    strs = to_str_func(arr)
    ret = np.zeros(list(arr.shape) + [m], dtype=np.int8)
    for bit_ix in range(0, m):
        fetch_bit_func = np.vectorize(lambda x: x[bit_ix] == '1')
        ret[..., bit_ix] = fetch_bit_func(strs).astype("int8")
    return ret
Example:
In [1]: vec_bin_array(np.array([[100, 42], [2, 5]]), 8)
Out[1]: array([[[0, 1, 1, 0, 0, 1, 0, 0],
[0, 0, 1, 0, 1, 0, 1, 0]],
[[0, 0, 0, 0, 0, 0, 1, 0],
[0, 0, 0, 0, 0, 1, 0, 1]]], dtype=int8)
You should be able to vectorize this, something like:
>>> d = np.array([1,2,3,4,5])
>>> m = 8
>>> (((d[:,None] & (1 << np.arange(m)))) > 0).astype(int)
array([[1, 0, 0, 0, 0, 0, 0, 0],
       [0, 1, 0, 0, 0, 0, 0, 0],
       [1, 1, 0, 0, 0, 0, 0, 0],
       [0, 0, 1, 0, 0, 0, 0, 0],
       [1, 0, 1, 0, 0, 0, 0, 0]])
which just gets the appropriate bit weights and then takes the bitwise and:
>>> (1 << np.arange(m))
array([  1,   2,   4,   8,  16,  32,  64, 128])
>>> d[:,None] & (1 << np.arange(m))
array([[1, 0, 0, 0, 0, 0, 0, 0],
       [0, 2, 0, 0, 0, 0, 0, 0],
       [1, 2, 0, 0, 0, 0, 0, 0],
       [0, 0, 4, 0, 0, 0, 0, 0],
       [1, 0, 4, 0, 0, 0, 0, 0]])
There are lots of ways to convert this to 1s wherever it's non-zero: (> 0)*1, .astype(bool).astype(int), etc. I chose one basically at random.
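Note that the output above lists the least significant bit first. If the most significant bit should come first, as in the bin_array examples, one option is to shift by descending bit positions and mask with 1. This is a minimal sketch, not part of the answer above: the name bits_msb_first is made up for illustration, and it assumes non-negative integers that fit in m bits.

```python
import numpy as np

def bits_msb_first(arr, m):
    """Expand each integer in arr into m bits, most significant bit first."""
    arr = np.asarray(arr)
    shifts = np.arange(m - 1, -1, -1)   # m-1, ..., 1, 0
    # The new trailing axis broadcasts against shifts, so any input shape works.
    return ((arr[..., None] >> shifts) & 1).astype(np.int8)

print(bits_msb_first([1, 2, 3, 4, 5], 8))
```

Since the new axis is appended with `arr[..., None]`, the same function should also handle 2-D inputs such as the one passed to vec_bin_array.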
Seems like you could just modify the resulting array. I don't know the function exactly, but most implementations like np.unpackbits would not inherently know the size of the number: Python ints can be arbitrarily large, after all, and don't have a native size. However, if you know m, you can easily 'fix' the array. Basically, an unpack function will give you some number of bits (a multiple of 8), enough to cover the byte with the highest 1 in the number. You just need to remove extra leading 0s, or prepend 0s, to get the right length:
import numpy as np

m = 4
mval = np.unpackbits(np.uint8(15))  # 8 bits, most significant bit first
if len(mval) > m:
    # Drop the extra leading zeros
    mval = mval[m - len(mval):]
elif m > len(mval):
    # Create an extra array of zeros, and prepend it
    mval = np.concatenate([np.zeros(m - len(mval), dtype=np.uint8), mval])
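The same trimming idea can be wrapped in a function that also handles values wider than one byte. The sketch below is one possible way to do it and is not taken from the answer above: the name unpack_fixed_width is hypothetical, and it relies on Python's int.to_bytes to produce big-endian bytes before calling np.unpackbits. It assumes m >= 1 and 0 <= num < 2**m.

```python
import numpy as np

def unpack_fixed_width(num, m):
    """Return the m low-order bits of num as a uint8 array, MSB first."""
    n_bytes = (m + 7) // 8                      # smallest byte count covering m bits
    # to_bytes raises OverflowError if num does not fit in n_bytes bytes.
    raw = np.frombuffer(num.to_bytes(n_bytes, "big"), dtype=np.uint8)
    bits = np.unpackbits(raw)                   # 8 * n_bytes bits, MSB first
    # Any bits above position m are dropped silently by the slice.
    return bits[-m:]

print(unpack_fixed_width(15, 4))
print(unpack_fixed_width(5, 12))
```

Going through to_bytes sidesteps the question of which fixed-width numpy dtype to pick, at the cost of handling one Python int at a time.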