I have a NumPy array [1,2,3,4,5,6,7,8,9,10,11,12,13,14]
and want to have an array structured like [[1,2,3,4], [2,3,4,5], [3,4,5,6], ..., [11,12,13,14]]
The efficient NumPy way to do this is given here, which is somewhat too long to reproduce here. It boils down to using some stride tricks, and is much faster than itertools for largish window sizes. For example, using a method essentially the same as that of of Alex Martelli's:
In [16]: def windowed(sequence, length):
seqs = tee(sequence, length)
[ seq.next() for i, seq in enumerate(seqs) for j in xrange(i) ]
return zip(*seqs)
We get:
In [19]: data = numpy.random.randint(0, 2, 1000000)
In [20]: %timeit windowed(data, 2)
100000 loops, best of 3: 6.62 us per loop
In [21]: %timeit windowed(data, 10)
10000 loops, best of 3: 29.3 us per loop
In [22]: %timeit windowed(data, 100)
1000 loops, best of 3: 1.41 ms per loop
In [23]: %timeit segment_axis(data, 2, 1)
10000 loops, best of 3: 30.1 us per loop
In [24]: %timeit segment_axis(data, 10, 9)
10000 loops, best of 3: 30.2 us per loop
In [25]: %timeit segment_axis(data, 100, 99)
10000 loops, best of 3: 30.5 us per loop