I am trying to improve a function that calculates, for each pixel of an image, the standard deviation of the pixels located in the neighborhood of that pixel. My function uses two nested loops over the image, which is slow.
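For reference, the loop-based sliding_std_dev used in the timings below is not shown in the question; a minimal sketch of that style of implementation (a plain double loop over 2*radius windows, so the original may differ in detail) is:

import numpy as np

def sliding_std_dev(image_original, radius=5):
    # Slow baseline: slice out each 2*radius window with Python loops
    # and take its standard deviation, leaving a zero border untouched.
    cols, rows = image_original.shape
    out = np.zeros((cols, rows))
    for i in range(radius, cols - radius):
        for j in range(radius, rows - radius):
            window = image_original[i - radius:i + radius, j - radius:j + radius]
            out[i, j] = np.std(window)
    return out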
You can first obtain the indices and then use np.take to form the new array:
import numpy as np

def new_std_dev(image_original, radius=5):
    cols, rows = image_original.shape
    # First obtain the flat indices of the window anchored at the top-left corner
    diameter = np.arange(radius*2)
    x, y = np.meshgrid(diameter, diameter)
    index = np.ravel_multi_index((y, x), (cols, rows)).ravel()
    # Broadcast this to every anchor position and take the stdev along the last axis
    index = index + np.arange(rows - radius*2)[:, None] + np.arange(cols - radius*2)[:, None, None]*rows
    data = np.std(np.take(image_original, index), -1)
    # Add the zeros back to the borders of the output array
    top = np.zeros((radius, rows - radius*2))
    sides = np.zeros((cols, radius))
    data = np.vstack((top, data, top))
    data = np.hstack((sides, data, sides))
    return data
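To see what the indexing is doing, here is a small demonstration on a toy array (the shapes and radius are chosen only for illustration):

import numpy as np

a = np.arange(5*4, dtype=float).reshape(5, 4)   # a tiny 5x4 "image"
radius = 1

# flat indices of the 2x2 window anchored at the top-left corner
diameter = np.arange(radius*2)
x, y = np.meshgrid(diameter, diameter)
index = np.ravel_multi_index((y, x), a.shape).ravel()
print(index)                                     # [0 1 4 5]

# broadcasting shifts that window to every valid anchor position
index = index + np.arange(a.shape[1] - radius*2)[:, None] \
              + np.arange(a.shape[0] - radius*2)[:, None, None]*a.shape[1]
print(index.shape)                               # (3, 2, 4): one flattened window per anchor
print(np.take(a, index)[0, 0])                   # the window a[0:2, 0:2] -> [0. 1. 4. 5.]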
First generate some random data, check that the two functions agree, and compare timings:
a=np.random.rand(50,20)
print np.allclose(new_std_dev(a),sliding_std_dev(a))
True
%timeit sliding_std_dev(a)
100 loops, best of 3: 18 ms per loop
%timeit new_std_dev(a)
1000 loops, best of 3: 472 us per loop
For larger arrays it is always faster, as long as you have enough memory:
a=np.random.rand(200,200)
print np.allclose(new_std_dev(a),sliding_std_dev(a))
True
%timeit sliding_std_dev(a)
1 loops, best of 3: 1.58 s per loop
%timeit new_std_dev(a)
10 loops, best of 3: 52.3 ms per loop
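The memory cost comes from the intermediate arrays: the index array has one entry per sample per window, and np.take materializes a float array of the same shape. A rough back-of-the-envelope estimate for the 200x200 case with the default radius, assuming 8-byte integers and floats:

cols, rows, radius = 200, 200, 5
n_windows = (cols - 2*radius) * (rows - 2*radius)    # 190 * 190 = 36100 windows
window_size = (2*radius)**2                          # 100 samples per window
# int64 index array plus the float64 array returned by np.take, 8 bytes each
print(2 * n_windows * window_size * 8 / 1e6)         # ~58 MB of temporaries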
The original function is faster for very small arrays; the break-even point looks to be around hgt*wdt > 50. One thing to note: your function takes square frames and places the standard deviation at the bottom-right index, rather than sampling around the index. Is this intentional?
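If the goal is actually to sample around each index, a centered variant of the same trick would use an odd window of size 2*radius+1 (this is only a sketch of that alternative, not a drop-in replacement for your original, so its output will differ from sliding_std_dev):

import numpy as np

def centered_std_dev(image_original, radius=5):
    # Same index/np.take trick, but with a (2*radius+1) x (2*radius+1) window
    # so that each output pixel describes the neighborhood centered on it.
    cols, rows = image_original.shape
    size = 2*radius + 1
    diameter = np.arange(size)
    x, y = np.meshgrid(diameter, diameter)
    index = np.ravel_multi_index((y, x), (cols, rows)).ravel()
    index = index + np.arange(rows - size + 1)[:, None] \
                  + np.arange(cols - size + 1)[:, None, None]*rows
    data = np.std(np.take(image_original, index), -1)
    # pad radius zeros on every side so data[i, j] is the std dev of the
    # window centered at (i, j)
    top = np.zeros((radius, rows - size + 1))
    sides = np.zeros((cols, radius))
    data = np.vstack((top, data, top))
    data = np.hstack((sides, data, sides))
    return data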