I am implementing an algorithm which requires me to look at non-overlapping consecutive submatrices within a (strictly two-dimensional) NumPy array. For example, for the 12 by 12
I'm adding this answer to an old question since an edit has bumped this question up. Here's an alternative way to calculate blocks:
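size = 3  # block edge length; must divide both dimensions of a
lenr, lenc = a.shape[0] // size, a.shape[1] // size  # number of block rows and block columns
# t[i, j] is the size-by-size block at block row i, block column j
t = a.reshape(lenr, size, lenc, size).transpose(0, 2, 1, 3)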
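To make the indexing concrete, here is a minimal self-contained sketch of the same reshape/transpose idea. The helper name `blocks`, the divisibility check, and the 12 by 12 `np.arange` test array are my own assumptions for illustration, not part of the original question.

```python
import numpy as np

def blocks(a, size):
    """Return a 4-D array t where t[i, j] is the (i, j)-th size x size block of a.

    Hypothetical helper: assumes a is 2-D and both dimensions are
    exact multiples of size.
    """
    rows, cols = a.shape
    if rows % size or cols % size:
        raise ValueError("array dimensions must be multiples of size")
    # Split each axis into (block index, offset within block), then
    # bring the two block indices to the front.
    return a.reshape(rows // size, size, cols // size, size).transpose(0, 2, 1, 3)

a = np.arange(144).reshape(12, 12)  # illustrative 12 by 12 input
t = blocks(a, 3)

print(t.shape)   # one 3x3 block per (block row, block column) pair
print(t[1, 2])   # same values as a[3:6, 6:9]
```

For a contiguous input, reshape and transpose normally return views rather than copies, which is probably why this approach is so cheap: no block data is moved until you actually read it.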
Profiling shows that this is the fastest of the approaches compared. The profiling was done with Python 3.5, with the results from map passed to array() for compatibility, since map returns an iterator in Python 3.
reshape/transpose: 643 ns per loop
reshape/index: 45.8 µs per loop
map/split: 10.3 µs per loop
Interestingly, the iterator version of map is faster. In any case, reshape followed by transpose is the fastest of the three.
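If you want to repeat a comparison like this on your own machine, a rough sketch with `timeit` follows. The `split_blocks` function is only my guess at what a split-based variant might look like, not the code that was actually profiled above, and the function names and test array are assumptions. Absolute numbers will differ with hardware and NumPy version.

```python
import timeit
import numpy as np

a = np.arange(144).reshape(12, 12)  # illustrative 12 by 12 input
size = 3

def reshape_transpose(a, size):
    rows, cols = a.shape
    return a.reshape(rows // size, size, cols // size, size).transpose(0, 2, 1, 3)

def split_blocks(a, size):
    # Assumed comparison variant: cut into row bands, then cut each
    # band into column blocks, and stack everything into one array.
    rows, cols = a.shape
    bands = np.split(a, rows // size, axis=0)
    return np.array([np.split(band, cols // size, axis=1) for band in bands])

# Both variants should produce the same blocks before we time them.
same = np.array_equal(reshape_transpose(a, size), split_blocks(a, size))

n = 1000
for fn in (reshape_transpose, split_blocks):
    # Take the best of several repeats to reduce noise.
    best = min(timeit.repeat(lambda: fn(a, size), number=n, repeat=3)) / n
    print(f"{fn.__name__}: {best * 1e6:.2f} us per call")
```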