Removing duplicate columns and rows from a NumPy 2D array

别那么骄傲 2020-11-29 04:55

I'm using a 2D shape array to store pairs of longitudes+latitudes. At one point, I have to merge two of these 2D arrays, and then remove any duplicated entry. I've been se

6 answers
  •  执念已碎
    2020-11-29 05:08

    >>> import numpy as NP
    >>> # create a 2D NumPy array with some duplicate rows
    >>> A = NP.array([[1, 1, 1, 5, 7],
    ...               [5, 4, 5, 4, 7],
    ...               [7, 9, 4, 7, 8],
    ...               [5, 4, 5, 4, 7],
    ...               [1, 1, 1, 5, 7],
    ...               [5, 4, 5, 4, 7],
    ...               [7, 9, 4, 7, 8],
    ...               [5, 4, 5, 4, 7],
    ...               [7, 9, 4, 7, 8]])
    
    >>> # first, sort the rows with lexsort so duplicates become contiguous;
    >>> # whole rows are moved, so the contents of each row are preserved
    >>> a, b, c, d, e = A.T    # one key per column, to pass to lexsort
    >>> ndx = NP.lexsort((a, b, c, d, e))    # the last key (e) is the primary sort key
    >>> ndx
        array([1, 3, 5, 7, 0, 4, 2, 6, 8])
    >>> A = A[ndx]
    
    >>> # now diff by row: an all-zero row means "same as the row above"
    >>> A1 = NP.diff(A, axis=0)
    >>> A1
        array([[ 0,  0,  0,  0,  0],
               [ 0,  0,  0,  0,  0],
               [ 0,  0,  0,  0,  0],
               [-4, -3, -4,  1,  0],
               [ 0,  0,  0,  0,  0],
               [ 6,  8,  3,  2,  1],
               [ 0,  0,  0,  0,  0],
               [ 0,  0,  0,  0,  0]])
    
    >>> # flag each row that differs from the row above it, i.e. the first row
    >>> # of a new group; the very first row always starts a group
    >>> ndx = NP.concatenate(([True], NP.any(A1, axis=1)))
    >>> ndx
        array([ True, False, False, False,  True, False,  True, False, False])
    
    >>> # retrieve the unique rows:
    >>> A[ndx]
        array([[5, 4, 5, 4, 7],
               [1, 1, 1, 5, 7],
               [7, 9, 4, 7, 8]])
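
    The sort-then-compare steps above can be wrapped in a small function. This is only a minimal sketch: the name `unique_rows` and the sample longitude/latitude values are made up for illustration, and it compares adjacent rows with `!=` rather than `NP.diff`, which is the same idea for ordinary finite numeric data.

```python
import numpy as np


def unique_rows(arr):
    """Return the distinct rows of a 2D array (hypothetical helper).

    Rows come back sorted with the last column as the primary key,
    because that is how np.lexsort orders its keys.
    """
    arr = np.asarray(arr)
    if arr.shape[0] == 0:
        return arr.copy()
    # Passing arr.T hands lexsort one key per column; the LAST column
    # is the primary sort key, as in lexsort((a, b, c, d, e)).
    order = np.lexsort(arr.T)
    sorted_rows = arr[order]
    # A row starts a new group when it differs from the row above it.
    # The first row always starts a group, so its flag stays True.
    is_first = np.ones(len(sorted_rows), dtype=bool)
    is_first[1:] = np.any(sorted_rows[1:] != sorted_rows[:-1], axis=1)
    return sorted_rows[is_first]


# Made-up longitude/latitude pairs, merged from two arrays.
first = np.array([[2.35, 48.85], [13.40, 52.52]])
second = np.array([[13.40, 52.52], [-0.13, 51.51]])
merged = np.vstack((first, second))
result = unique_rows(merged)  # three distinct pairs remain
```

    In NumPy versions that support the `axis` argument of `unique`, `np.unique(merged, axis=0)` should give the same set of rows, ordered by the first column instead of the last.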
    
