Improving code efficiency: standard deviation on sliding windows

Front-end · open · 4 answers · 1505 views
Asked by 渐次进展 on 2020-12-03 03:29

I am trying to improve a function that calculates, for each pixel of an image, the standard deviation of the pixels located in that pixel's neighborhood. My function uses tw…

4 Answers
  • 2020-12-03 03:33

    You can first obtain the indices and then use np.take to form the new array:

    def new_std_dev(image_original, radius=5):
        cols, rows = image_original.shape

        # First obtain the indices for the top-left position
        diameter = np.arange(radius * 2)
        x, y = np.meshgrid(diameter, diameter)
        index = np.ravel_multi_index((y, x), (cols, rows)).ravel()

        # Cast this into two dimensions and take the stdev
        index = (index + np.arange(rows - radius * 2)[:, None]
                 + np.arange(cols - radius * 2)[:, None, None] * rows)
        data = np.std(np.take(image_original, index), -1)

        # Add the zeros back to the output array
        top = np.zeros((radius, rows - radius * 2))
        sides = np.zeros((cols, radius))

        data = np.vstack((top, data, top))
        data = np.hstack((sides, data, sides))
        return data
    

    First generate some random data and check timings:

    a = np.random.rand(50, 20)
    
    print(np.allclose(new_std_dev(a), sliding_std_dev(a)))
    True
    
    %timeit sliding_std_dev(a)
    100 loops, best of 3: 18 ms per loop
    
    %timeit new_std_dev(a)
    1000 loops, best of 3: 472 us per loop
    

    For larger arrays it's always faster, as long as you have enough memory:

    a = np.random.rand(200, 200)
    
    print(np.allclose(new_std_dev(a), sliding_std_dev(a)))
    True
    
    %timeit sliding_std_dev(a)
    1 loops, best of 3: 1.58 s per loop
    
    %timeit new_std_dev(a)
    10 loops, best of 3: 52.3 ms per loop
    

    The original function is faster for very small arrays; the break-even point looks to be around hgt*wdt > 50. One thing to note: your function takes square frames and places the std dev at the bottom-right index, rather than sampling around the index. Is this intentional?
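    To see the index construction concretely, here is a tiny standalone sketch of the same trick (the 4x5 image and radius=1 are arbitrary choices for illustration, not from the answer):

    ```python
    import numpy as np

    cols, rows = 4, 5
    radius = 1

    # Flat-index template for the top-left (2r x 2r) window...
    diameter = np.arange(radius * 2)
    x, y = np.meshgrid(diameter, diameter)
    index = np.ravel_multi_index((y, x), (cols, rows)).ravel()

    # ...then broadcast offsets shift that template right and down,
    # one copy per valid window position.
    index = (index + np.arange(rows - radius * 2)[:, None]
             + np.arange(cols - radius * 2)[:, None, None] * rows)

    image = np.arange(cols * rows, dtype=float).reshape(cols, rows)
    windows = np.take(image, index)   # shape: (cols-2r, rows-2r, (2r)**2)
    print(windows.shape)              # → (2, 3, 4)

    # windows[i, j] holds image[i:i+2, j:j+2] flattened:
    print(np.allclose(np.std(windows, -1)[0, 0], np.std(image[:2, :2])))  # → True
    ```
    
    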

  • 2020-12-03 03:44

    The most often used method to do this kind of things in image processing is using summed area tables, an idea introduced in this paper in 1984. The idea is that, when you compute a quantity by adding over a window, and move the window e.g. one pixel to the right, you don't need to add all the items in the new window, you only need to subtract the leftmost column from the total, and add the new rightmost column. So if you create an accumulated sum array over both dimensions from your array, you can get the sum over a window with a couple of sums and a subtraction. If you keep summed area tables for your array and its square, it's very easy to get the variance from those two. Here's an implementation:

    def windowed_sum(a, win):
        table = np.cumsum(np.cumsum(a, axis=0), axis=1)
        win_sum = np.empty(tuple(np.subtract(a.shape, win-1)))
        win_sum[0,0] = table[win-1, win-1]
        win_sum[0, 1:] = table[win-1, win:] - table[win-1, :-win]
        win_sum[1:, 0] = table[win:, win-1] - table[:-win, win-1]
        win_sum[1:, 1:] = (table[win:, win:] + table[:-win, :-win] -
                           table[win:, :-win] - table[:-win, win:])
        return win_sum
    
    def windowed_var(a, win):
        win_a = windowed_sum(a, win)
        win_a2 = windowed_sum(a*a, win)
        return (win_a2 - win_a * win_a / (win * win)) / (win * win)
    

    To see that this works:

    >>> a = np.arange(25).reshape(5,5)
    >>> windowed_var(a, 3)
    array([[ 17.33333333,  17.33333333,  17.33333333],
           [ 17.33333333,  17.33333333,  17.33333333],
           [ 17.33333333,  17.33333333,  17.33333333]])
    >>> np.var(a[:3, :3])
    17.333333333333332
    >>> np.var(a[-3:, -3:])
    17.333333333333332
    

    This should run a couple of notches faster than convolution based methods.
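    As a quick sanity check (not from the answer), the summed-area result can be compared against a brute-force `np.var` over every window; the two functions above are repeated so the snippet runs standalone:

    ```python
    import numpy as np

    def windowed_sum(a, win):
        # Summed-area table: cumulative sums along both axes.
        table = np.cumsum(np.cumsum(a, axis=0), axis=1)
        win_sum = np.empty(tuple(np.subtract(a.shape, win - 1)))
        win_sum[0, 0] = table[win - 1, win - 1]
        win_sum[0, 1:] = table[win - 1, win:] - table[win - 1, :-win]
        win_sum[1:, 0] = table[win:, win - 1] - table[:-win, win - 1]
        win_sum[1:, 1:] = (table[win:, win:] + table[:-win, :-win] -
                           table[win:, :-win] - table[:-win, win:])
        return win_sum

    def windowed_var(a, win):
        win_a = windowed_sum(a, win)
        win_a2 = windowed_sum(a * a, win)
        return (win_a2 - win_a * win_a / (win * win)) / (win * win)

    rng = np.random.default_rng(0)
    a = rng.random((8, 8))
    fast = windowed_var(a, 3)
    # Brute force: np.var over each 3x3 window.
    slow = np.array([[np.var(a[i:i+3, j:j+3]) for j in range(6)]
                     for i in range(6)])
    print(np.allclose(fast, slow))   # → True
    ```
    
    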

  • 2020-12-03 03:52

    First off, there's more than one way to do this.

    It's not the most efficient speed-wise, but using scipy.ndimage.generic_filter will allow you to easily apply an arbitrary python function over a moving window.

    As a quick example:

    result = scipy.ndimage.generic_filter(data, np.std, size=2*radius)
    

    Note that the boundary conditions can be controlled with the mode kwarg.
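    For instance (a sketch assuming SciPy is installed; the data and window size are arbitrary), two different edge-padding choices give different results near the borders:

    ```python
    import numpy as np
    from scipy import ndimage

    # Rolling std over 4x4 windows; `mode` picks how edges are padded.
    data = np.arange(36, dtype=float).reshape(6, 6)
    reflected = ndimage.generic_filter(data, np.std, size=4, mode='reflect')
    zero_padded = ndimage.generic_filter(data, np.std, size=4,
                                         mode='constant', cval=0.0)
    # Output keeps the input's shape either way; only border windows differ.
    print(reflected.shape, zero_padded.shape)   # → (6, 6) (6, 6)
    ```
    
    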


    Another way to do this is to use some striding tricks to make a view of the array that's effectively a moving window, and then apply np.std along the last axis. (Note: this is taken from one of my previous answers here: https://stackoverflow.com/a/4947453/325565)

    def strided_sliding_std_dev(data, radius=5):
        windowed = rolling_window(data, (2*radius, 2*radius))
        shape = windowed.shape
        windowed = windowed.reshape(shape[0], shape[1], -1)
        return windowed.std(axis=-1)
    
    def rolling_window(a, window):
        """Takes a numpy array *a* and a sequence of (or single) *window* lengths
        and returns a view of *a* that represents a moving window."""
        if not hasattr(window, '__iter__'):
            return rolling_window_lastaxis(a, window)
        for i, win in enumerate(window):
            if win > 1:
                a = a.swapaxes(i, -1)
                a = rolling_window_lastaxis(a, win)
                a = a.swapaxes(-2, i)
        return a
    
    def rolling_window_lastaxis(a, window):
        """Directly taken from Erik Rigtorp's post to numpy-discussion.
        <http://www.mail-archive.com/numpy-discussion@scipy.org/msg29450.html>"""
        if window < 1:
            raise ValueError("`window` must be at least 1.")
        if window > a.shape[-1]:
            raise ValueError("`window` is too long.")
        shape = a.shape[:-1] + (a.shape[-1] - window + 1, window)
        strides = a.strides + (a.strides[-1],)
        return np.lib.stride_tricks.as_strided(a, shape=shape, strides=strides)
    

    It's a bit hard to understand what's going on here at first glance. Not to plug one of my own answers, but I don't want to re-type the explanation, so have a look here: https://stackoverflow.com/a/4924433/325565 if you haven't seen these sorts of "striding" tricks before.
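    A minimal 1-D illustration of the trick (a standalone sketch, built the same way `rolling_window_lastaxis` does it): repeating the last stride turns a vector into overlapping length-3 windows without copying any data.

    ```python
    import numpy as np

    a = np.arange(6.0)
    window = 3
    # One extra axis of length `window`, stepping by the same stride,
    # so consecutive rows overlap by window-1 elements.
    shape = a.shape[:-1] + (a.shape[-1] - window + 1, window)
    strides = a.strides + (a.strides[-1],)
    view = np.lib.stride_tricks.as_strided(a, shape=shape, strides=strides)
    print(view)
    # [[0. 1. 2.]
    #  [1. 2. 3.]
    #  [2. 3. 4.]
    #  [3. 4. 5.]]
    print(view.std(axis=-1))   # rolling standard deviation, one value per window
    ```
    
    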

    If we compare timings with a 100x100 array of random floats with a radius of 5, it's ~10x faster than the original or the generic_filter version. However, you have no flexibility in the boundary conditions with this version. (It's identical to what you're currently doing, while the generic_filter version gives you lots of flexibility at the expense of speed.)

    # Your original function with nested loops
    In [21]: %timeit sliding_std_dev(data)
    1 loops, best of 3: 237 ms per loop
    
    # Using scipy.ndimage.generic_filter
    In [22]: %timeit ndimage_std_dev(data)
    1 loops, best of 3: 244 ms per loop
    
    # The "stride-tricks" version above
    In [23]: %timeit strided_sliding_std_dev(data)
    100 loops, best of 3: 15.4 ms per loop
    
    # Ophion's version that uses `np.take`
    In [24]: %timeit new_std_dev(data)
    100 loops, best of 3: 19.3 ms per loop
    

    The downside to the "stride-tricks" version is that unlike "normal" strided rolling window tricks, this version does make a copy, and it's much larger than the original array. You will run into memory problems if you use this on a large array! (On a side note, it's basically equivalent to @Ophion's answer in terms of memory use and speed. It's just a different approach to doing the same thing.)

  • 2020-12-03 03:55

    Cool trick: you can compute the standard deviation given just the sum of squared values and the sum of values in the window.

    Therefore, you can compute the standard deviation very fast using a uniform filter on the data:

    from scipy.ndimage.filters import uniform_filter
    
    def window_stdev(arr, radius):
        c1 = uniform_filter(arr, radius*2, mode='constant', origin=-radius)
        c2 = uniform_filter(arr*arr, radius*2, mode='constant', origin=-radius)
        return ((c2 - c1*c1)**.5)[:-radius*2+1,:-radius*2+1]
    

    This is ridiculously faster than the original function. For a 1024x1024 array and a radius of 20, the old function takes 34.11 seconds, and the new function takes 0.11 seconds, a speed-up of 300-fold.


    How does this work mathematically? It computes the quantity sqrt(mean(x^2) - mean(x)^2) for each window. We can derive this quantity from the standard deviation sqrt(mean((x - mean(x))^2)) as follows:

    Let E be the expectation operator (basically mean()), and X be the random variable of data. Then:

    E[(X - E[X])^2]
    = E[X^2 - 2X*E[X] + E[X]^2]
    = E[X^2] - E[2X*E[X]] + E[E[X]^2] (by the linearity of the expectation operator)
    = E[X^2] - 2E[X]*E[X] + E[X]^2 (again by linearity, and the fact that E[X] is a constant)
    = E[X^2] - E[X]^2

    which proves that the quantity under the square root is exactly the variance, so its square root computed by this technique is mathematically equivalent to the standard deviation.
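    The identity is easy to check numerically on an arbitrary sample (the data here is just an example):

    ```python
    import numpy as np

    # Check E[(X - E[X])^2] == E[X^2] - E[X]^2 on a small sample.
    x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
    lhs = np.mean((x - x.mean()) ** 2)   # definition of variance
    rhs = np.mean(x ** 2) - x.mean() ** 2  # the rearranged form
    print(lhs, rhs)                        # → 4.0 4.0
    print(np.isclose(np.sqrt(rhs), x.std()))   # → True
    ```
    
    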
