Sort invariant for numpy.argsort with multiple dimensions

谎友^ 2020-11-30 11:50

The numpy.argsort docs state:

Returns:
index_array : ndarray, int Array of indices that sort a along the specified axis. If a is one-dimensional, <

3 answers
  • 2020-11-30 12:23

    The numpy issue #8708 has a sample implementation of take_along_axis that does what I need; I'm not sure if it's efficient for large arrays but it seems to work.

    import numpy as np

    def take_along_axis(arr, ind, axis):
        """
        ... here means a "pack" of dimensions, possibly empty
    
        arr: array_like of shape (A..., M, B...)
            source array
        ind: array_like of shape (A..., K..., B...)
            indices to take along each 1d slice of `arr`
        axis: int
            index of the axis with dimension M
    
        out: array_like of shape (A..., K..., B...)
            out[a..., k..., b...] = arr[a..., ind[a..., k..., b...], b...]
        """
        # normalize a negative axis to its non-negative equivalent
        if axis < 0:
            if axis >= -arr.ndim:
                axis += arr.ndim
            else:
                raise IndexError('axis out of range')
        ind_shape = (1,) * ind.ndim
        ins_ndim = ind.ndim - (arr.ndim - 1)   #inserted dimensions
    
        dest_dims = list(range(axis)) + [None] + list(range(axis+ins_ndim, ind.ndim))
    
        # could also call np.ix_ here with some dummy arguments, then throw those results away
        inds = []
        for dim, n in zip(dest_dims, arr.shape):
            if dim is None:
                inds.append(ind)
            else:
                ind_shape_dim = ind_shape[:dim] + (-1,) + ind_shape[dim+1:]
                inds.append(np.arange(n).reshape(ind_shape_dim))
    
        return arr[tuple(inds)]
    

    which yields

    >>> A = np.array([[3,2,1],[4,0,6]])
    >>> B = np.array([[3,1,4],[1,5,9]])
    >>> i = A.argsort(axis=-1)
    >>> take_along_axis(A,i,axis=-1)
    array([[1, 2, 3],
           [0, 4, 6]])
    >>> take_along_axis(B,i,axis=-1)
    array([[4, 1, 3],
           [5, 1, 9]])
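
For comparison, NumPy 1.15 and later ship a built-in np.take_along_axis. The sketch below repeats the sample above with the built-in instead of the hand-written function. It assumes ind has the same number of dimensions as arr, which is the case for argsort output.

```python
import numpy as np

A = np.array([[3, 2, 1], [4, 0, 6]])
B = np.array([[3, 1, 4], [1, 5, 9]])
i = A.argsort(axis=-1)

# Built-in equivalent of the hand-written take_along_axis (NumPy >= 1.15).
sorted_A = np.take_along_axis(A, i, axis=-1)
reordered_B = np.take_along_axis(B, i, axis=-1)

print(sorted_A)      # rows of A in ascending order
print(reordered_B)   # rows of B permuted by A's sort order
```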
    
  • We need to use advanced indexing along every axis. np.ogrid creates open grids of range arrays for all axes; we then replace the entry for the chosen axis with the input indices and index into the data array with the result. In essence:

    # Inputs : arr, ind, axis
    idx = list(np.ogrid[tuple(map(slice, ind.shape))])  # list() keeps idx mutable for the assignment below
    idx[axis] = ind
    out = arr[tuple(idx)]
    

    To make this reusable and add an axis check, let's split it into two functions: one that builds the index tuple, and one that applies it to the data array. The index tuple from the first function can be reused to index any array that has the necessary number of dimensions and lengths along each axis.

    Hence, the implementations would be -

    def advindex_allaxes(ind, axis):
        axis = np.core.multiarray.normalize_axis_index(axis,ind.ndim)
        idx = list(np.ogrid[tuple(map(slice, ind.shape))])  # list() keeps idx mutable
        idx[axis] = ind
        return tuple(idx)
    
    def take_along_axis(arr, ind, axis):
        return arr[advindex_allaxes(ind, axis)]
    

    Sample runs -

    In [161]: A = np.array([[3,2,1],[4,0,6]])
    
    In [162]: B = np.array([[3,1,4],[1,5,9]])
    
    In [163]: i = A.argsort(axis=-1)
    
    In [164]: take_along_axis(A,i,axis=-1)
    Out[164]: 
    array([[1, 2, 3],
           [0, 4, 6]])
    
    In [165]: take_along_axis(B,i,axis=-1)
    Out[165]: 
    array([[4, 1, 3],
           [5, 1, 9]])
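
The same idea can be checked on a 3-D array with a non-final axis. This is a minimal sketch: advindex_allaxes is restated here with a plain modulo in place of the axis-normalization helper (an assumption made to keep the example self-contained, with no range check), and the arrays X and Y are made up for illustration.

```python
import numpy as np

def advindex_allaxes(ind, axis):
    # Simplified restatement: modulo wraps a negative axis but, unlike the
    # version above, does not reject an out-of-range axis.
    axis = axis % ind.ndim
    idx = list(np.ogrid[tuple(map(slice, ind.shape))])
    idx[axis] = ind
    return tuple(idx)

rng = np.random.default_rng(0)            # arbitrary seed
X = rng.integers(0, 100, size=(2, 4, 3))  # illustrative data
Y = rng.integers(0, 100, size=(2, 4, 3))

order = X.argsort(axis=1)
index_tuple = advindex_allaxes(order, axis=1)

# The same index tuple is reused for both arrays.
X_sorted = X[index_tuple]
Y_reordered = Y[index_tuple]
```

Building the tuple once and indexing several arrays with it avoids recomputing the open grids for each array.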
    


  • 2020-11-30 12:44

    Here A is a (3,2) array; argsort along the last axis produces a (3,2) index array:

    In [453]: idx=np.argsort(A,axis=-1)
    In [454]: idx
    Out[454]: 
    array([[0, 1],
           [1, 0],
           [0, 1]], dtype=int32)
    

    As you note, applying this to A to get the equivalent of np.sort(A, axis=-1) isn't obvious. The iterative solution is to sort each row (a 1d case) with:

    In [459]: np.array([x[i] for i,x in zip(idx,A)])
    Out[459]: 
    array([[-1.0856306 ,  0.99734545],
           [-1.50629471,  0.2829785 ],
           [-0.57860025,  1.65143654]])
    

    While probably not the fastest, it is probably the clearest solution, and a good starting point for conceptualizing a better solution.
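
A self-contained version of the row-by-row approach, checked against np.sort, might look like this. The random array and seed are assumptions for illustration, so the values differ from those printed above.

```python
import numpy as np

rng = np.random.default_rng(0)      # arbitrary seed, not the one used above
A = rng.standard_normal((3, 2))
idx = np.argsort(A, axis=-1)

# Apply each row's 1-D sort order to that row.
row_by_row = np.array([row[order] for order, row in zip(idx, A)])

print(np.array_equal(row_by_row, np.sort(A, axis=-1)))
```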

    The tuple(inds) built by the take_along_axis solution is:

    (array([[0],
            [1],
            [2]]), 
     array([[0, 1],
            [1, 0],
            [0, 1]], dtype=int32))
    In [470]: A[_]
    Out[470]: 
    array([[-1.0856306 ,  0.99734545],
           [-1.50629471,  0.2829785 ],
           [-0.57860025,  1.65143654]])
    

    In other words:

    In [472]: A[np.arange(3)[:,None], idx]
    Out[472]: 
    array([[-1.0856306 ,  0.99734545],
           [-1.50629471,  0.2829785 ],
           [-0.57860025,  1.65143654]])
    

    The first part is what np.ix_ would construct, but it does not 'like' the 2d idx.
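
To illustrate the np.ix_ remark: given 1-D inputs it builds the same (3,1) column index used above, but it rejects a 2-D index array with a ValueError. A small sketch, reusing the idx values printed earlier:

```python
import numpy as np

# With 1-D inputs, np.ix_ returns one open-grid index array per axis.
rows, cols = np.ix_(np.arange(3), np.arange(2))
print(rows.shape, cols.shape)

idx = np.array([[0, 1], [1, 0], [0, 1]])
try:
    np.ix_(np.arange(3), idx)   # 2-D input is not accepted
    ix_accepts_2d = True
except ValueError:
    ix_accepts_2d = False
print(ix_accepts_2d)
```

This is why the column index has to be written by hand as np.arange(3)[:,None] when it is paired with the 2-D idx.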


    Looks like I explored this topic a couple of years ago, in "argsort for a multidimensional ndarray":

    a[np.arange(np.shape(a)[0])[:,np.newaxis], np.argsort(a)]
    

    I tried to explain what is going on. The take_along_axis function does the same sort of thing, but constructs the indexing tuple for a more general case (any number of dimensions and any axis). Generalizing to more dimensions, but still with axis=-1, should be easy.

    For the first axis, A[np.argsort(A,axis=0),np.arange(2)] works.
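
Both 2-D cases can be put side by side and checked against np.sort. This is a sketch on an assumed random (3,2) array, with the hard-coded 2 and 3 replaced by A.shape lookups:

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed
A = rng.standard_normal((3, 2))

# First axis: the argsort result goes in slot 0, a plain range in slot 1.
by_column = A[np.argsort(A, axis=0), np.arange(A.shape[1])]

# Last axis: a column-shaped range goes in slot 0, the argsort result in slot 1.
by_row = A[np.arange(A.shape[0])[:, None], np.argsort(A, axis=-1)]

print(np.array_equal(by_column, np.sort(A, axis=0)))
print(np.array_equal(by_row, np.sort(A, axis=-1)))
```

In each case the range array is shaped so that it broadcasts against the (3,2) argsort result.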
