I am attempting to speed up a binary file parser I wrote last year by doing the parsing/data accumulation in numpy. numpy's ability to define customized data structures and
NumPy doesn't support integers of arbitrary byte length, and using ctypes bitfields would be more trouble than it's worth.
I'd suggest using vectorised slicing to widen your data to the next larger standard integer size (here, 6-byte values into 8-byte integers):
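import numpy as np

buf = b"000000111111222222"  # three packed 6-byte big-endian integers
a = np.frombuffer(buf, dtype='>i1')  # raw bytes, without copying
e = np.zeros(len(buf) // 6, dtype='>i8')  # one 8-byte slot per value
# Copy the three 16-bit words of each value into the low 48 bits of its slot;
# the most significant 16-bit word of every slot stays zero.
for i in range(3):
    e.view(dtype='>i2')[i + 1::4] = a.view(dtype='>i2')[i::3]
print([hex(x) for x in e])  # ['0x303030303030', '0x313131313131', '0x323232323232']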
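The same widening idea can be wrapped in a reusable function. The following is only a sketch under a few assumptions: the data is big-endian, the buffer length is a multiple of 6, and the names `unpack_uint48_be` and `unpack_int48_be` are hypothetical helpers, not part of NumPy. It copies bytes through a 2-D view instead of looping over 16-bit words. Because the top two bytes are zero-filled, the values come out as unsigned; the second helper shows one way to sign-extend if the 6-byte values are meant to be two's-complement signed.

```python
import numpy as np

def unpack_uint48_be(buf):
    """Widen packed big-endian 48-bit unsigned integers to 64-bit (hypothetical helper)."""
    if len(buf) % 6:
        raise ValueError("buffer length must be a multiple of 6")
    n = len(buf) // 6
    src = np.frombuffer(buf, dtype=np.uint8).reshape(n, 6)
    out = np.zeros((n, 8), dtype=np.uint8)
    out[:, 2:] = src  # the two most significant bytes of each row stay zero
    return out.view('>u8').reshape(n)

def unpack_int48_be(buf):
    """Same as above, but treat each value as two's-complement signed (an assumption)."""
    vals = unpack_uint48_be(buf).astype(np.int64)
    sign_bit = np.int64(1 << 47)
    # Standard sign-extension trick: flip the sign bit, then subtract its weight.
    return (vals ^ sign_bit) - sign_bit

values = unpack_uint48_be(b"000000111111222222")
print([hex(int(x)) for x in values])
# ['0x303030303030', '0x313131313131', '0x323232323232']
```

The result of `unpack_uint48_be` keeps the big-endian `'>u8'` dtype; calling `.astype(np.uint64)` converts it to native byte order if later arithmetic should avoid byte swapping.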