Python - vectorizing a sliding window

既然无缘 2021-01-05 17:16

I'm trying to vectorize a sliding window operation. For the 1-D case a helpful example could go along the lines of:

x= vstack((np.array([range(10)]),np.arr         


        
4 Answers
  •  清歌不尽
    2021-01-05 18:16

    The problem lies in x[1,x[0,:]+1], specifically the index for the 2nd axis: x[0,:]+1 is [1 2 3 4 5 6 7 8 9 10], and index 10 is out of bounds for an axis of length 10 (valid indices run from 0 to 9).

    In the case of x[1,x[0,:]-1], the index for the 2nd axis is [-1 0 1 2 3 4 5 6 7 8], and you end up getting [9 0 1 2 3 4 5 6 7 8], because 9 is the last element and has an index of -1. The second element from the end has an index of -2, and so on.
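To make the two indexing cases above concrete, here is a minimal sketch. The question's definition of x is cut off, so the array below (two identical rows holding 0..9) is an assumption, chosen to be consistent with the outputs quoted in this answer.

```python
import numpy as np

# Assumed reconstruction of the question's array: two rows of 0..9.
x = np.vstack((np.arange(10), np.arange(10)))

# x[0, :] + 1 contains 10, which is not a valid index on an axis of length 10.
try:
    x[1, x[0, :] + 1]
    raised = False
except IndexError:
    raised = True

# x[0, :] - 1 contains -1, which is valid and refers to the last element.
shifted = x[1, x[0, :] - 1]

print(raised)   # True
print(shifted)  # [9 0 1 2 3 4 5 6 7 8]
```

So the +1 shift fails loudly, while the -1 shift silently wraps around.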

    With np.where((x[0,:]<5)&(x[0,:]>0),x[1,x[0,:]-1],x[1,:]) and x[0,:]=[0 1 2 3 4 5 6 7 8 9], what essentially happens is that the first cell is taken from x[1,:], because x[0,0] is 0 and so (x[0,:]<5)&(x[0,:]>0) is False there. The next four elements are taken from x[1,x[0,:]-1]. The rest come from x[1,:]. The final result is [0 0 1 2 3 5 6 7 8 9].

    It may appear to be OK for a sliding window of just 1 cell, but it's going to surprise you with:

    >>> np.where((x[0,:]<5)&(x[0,:]>0),x[1,x[0,:]-2],x[1,:])
    array([0, 9, 0, 1, 2, 5, 6, 7, 8, 9])
    

    That is what you get when you try to shift it by a window of two cells: position 1 asks for index -1, which wraps around to the last element, 9.
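One possible way to avoid the wraparound, sketched here under the assumption that clamping at the edge is acceptable for your data, is to clip the shifted indices with np.clip before using them. As before, x is an assumed reconstruction of the question's array, and the names shift, idx and cond are only illustrative.

```python
import numpy as np

# Assumed reconstruction of the question's array: two rows of 0..9.
x = np.vstack((np.arange(10), np.arange(10)))

shift = 2
# Clamp the shifted indices into the valid range so that -1 and -2
# map to index 0 instead of wrapping around to the end of the row.
idx = np.clip(x[0, :] - shift, 0, x.shape[1] - 1)

cond = (x[0, :] < 5) & (x[0, :] > 0)
result = np.where(cond, x[1, idx], x[1, :])

print(result)  # [0 0 0 1 2 5 6 7 8 9]
```

Whether repeating the edge value is the right behavior depends on what the window is supposed to mean at the boundary; it only removes the surprise of values from the far end showing up.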

    For this specific problem, if we want to keep everything in one line, this will do:

    >>> for i in [1, 2, 3, 4, 5, 6]:
        print hstack((np.where(x[1,x[0,:]-i]

    Edit: Now I understand your original question better. Basically, you want to take a 2D array and calculate the N*N cell average around each cell. That is quite common. First, you probably want to limit N to odd numbers, because otherwise something like a 2*2 average around a cell is difficult to define (no cell sits at the center of an even-sized window). Suppose we want a 3*3 average:

    # In this example, the shape is (10, 10)
    >>> import itertools
    >>> import numpy as np
    >>> a1 = np.array([[3, 7, 0, 9, 0, 8, 1, 4, 3, 3],
       [5, 6, 5, 2, 9, 2, 3, 5, 2, 9],
       [0, 9, 8, 5, 3, 1, 8, 1, 9, 4],
       [7, 4, 0, 0, 9, 3, 3, 3, 5, 4],
       [3, 1, 2, 4, 8, 8, 2, 1, 9, 6],
       [0, 0, 3, 9, 3, 0, 9, 1, 3, 3],
       [1, 2, 7, 4, 6, 6, 2, 6, 2, 1],
       [3, 9, 8, 5, 0, 3, 1, 4, 0, 5],
       [0, 3, 1, 4, 9, 9, 7, 5, 4, 5],
       [4, 3, 8, 7, 8, 6, 8, 1, 1, 8]])
    # shift the original array 'a1' around; use range(-2, 3) for a 5*5 average, and so on
    >>> movea1 = [a1[np.clip(np.arange(10)+i, 0, 9)][:, np.clip(np.arange(10)+j, 0, 9)] for i, j in itertools.product(range(-1, 2), repeat=2)]
    # then just take the average
    >>> averagea1 = np.mean(np.array(movea1), axis=0)
    # trim the result array, because the cells along the edges do not have a full 3*3 neighborhood
    >>> averagea1[1:10-1, 1:10-1]
    array([[ 4.77777778,  5.66666667,  4.55555556,  4.33333333,  3.88888889,
         3.66666667,  4.        ,  4.44444444],
       [ 4.88888889,  4.33333333,  4.55555556,  3.77777778,  4.55555556,
         3.22222222,  4.33333333,  4.66666667],
       [ 3.77777778,  3.66666667,  4.33333333,  4.55555556,  5.        ,
         3.33333333,  4.55555556,  4.66666667],
       [ 2.22222222,  2.55555556,  4.22222222,  4.88888889,  5.        ,
         3.33333333,  4.        ,  3.88888889],
       [ 2.11111111,  3.55555556,  5.11111111,  5.33333333,  4.88888889,
         3.88888889,  3.88888889,  3.55555556],
       [ 3.66666667,  5.22222222,  5.        ,  4.        ,  3.33333333,
         3.55555556,  3.11111111,  2.77777778],
       [ 3.77777778,  4.77777778,  4.88888889,  5.11111111,  4.77777778,
         4.77777778,  3.44444444,  3.55555556],
       [ 4.33333333,  5.33333333,  5.55555556,  5.66666667,  5.66666667,
         4.88888889,  3.44444444,  3.66666667]])
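The same shift-and-average steps can be wrapped into a function for any odd N. This is only a sketch of the approach above, not a tuned implementation; the name box_mean and its parameters are made up for illustration.

```python
import itertools
import numpy as np

def box_mean(a, n):
    """Mean of the n*n neighborhood around each interior cell of a 2D array.

    Hypothetical helper: n must be odd, and cells closer than n // 2 to an
    edge are trimmed away, as in the 3*3 example.
    """
    if n % 2 == 0:
        raise ValueError("n must be odd")
    half = n // 2
    h, w = a.shape
    rows = np.arange(h)
    cols = np.arange(w)
    # One shifted copy of the array per (row offset, column offset) pair.
    shifted = [
        a[np.clip(rows + i, 0, h - 1)][:, np.clip(cols + j, 0, w - 1)]
        for i, j in itertools.product(range(-half, half + 1), repeat=2)
    ]
    mean = np.mean(np.array(shifted), axis=0)
    # Keep only the cells that have a complete n*n neighborhood.
    return mean[half:h - half, half:w - half]

a = np.arange(25).reshape(5, 5)
print(box_mean(a, 3))
# [[ 6.  7.  8.]
#  [11. 12. 13.]
#  [16. 17. 18.]]
```

Note that this builds n*n full copies of the array, so memory use grows quickly with the window size.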
    

    I don't think you need to flatten your 2D array; that just causes confusion. Also, if you want to handle the edge elements differently rather than just trimming them away, consider making masked arrays using np.ma in the 'move your original array around' step.
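As a rough sketch of the masked-array idea, one option is to pad the array, mask the padding, and let np.ma ignore the masked cells when averaging, so that edge cells get the mean of whatever neighbors they actually have. The function name box_mean_masked is hypothetical, and this is just one of several ways to treat the edges.

```python
import itertools
import numpy as np

def box_mean_masked(a, n):
    """n*n mean around every cell; neighbors outside the array are ignored."""
    if n % 2 == 0:
        raise ValueError("n must be odd")
    half = n // 2
    h, w = a.shape
    # Pad the data with zeros and track which padded cells are real.
    padded = np.pad(a.astype(float), half, mode="constant", constant_values=0.0)
    valid = np.pad(np.ones((h, w), dtype=bool), half, mode="constant",
                   constant_values=False)
    data = []
    mask = []
    for i, j in itertools.product(range(n), repeat=2):
        data.append(padded[i:i + h, j:j + w])
        mask.append(~valid[i:i + h, j:j + w])  # True means "ignore this cell"
    stacked = np.ma.array(np.array(data), mask=np.array(mask))
    # Masked entries are excluded from both the sum and the count.
    return np.ma.filled(stacked.mean(axis=0), np.nan)

a = np.arange(9).reshape(3, 3)
result = box_mean_masked(a, 3)
print(result[0, 0], result[0, 1], result[1, 1])  # 2.0 2.5 4.0
```

Here the corner cell averages its 4 available neighbors, the edge cell averages 6, and only the center uses the full 3*3 window.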
