Filter numpy array to retain only one row for a given value

太阳男子 2020-12-11 10:40

I have a large n x 2 numpy array that is formatted as (x, y) coordinates. I would like to filter this array so as to:

  1. Identify coordinate pairs with duplicated
1 Answer
  • 2020-12-11 10:46

    Here's one way based on np.maximum.reduceat -

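    import numpy as np

    def grouby_maxY(a):
        # Sort rows by x so equal x values are contiguous (skip if already sorted)
        b = a[a[:,0].argsort()]
        # Start index of each run of equal x values
        grp_idx = np.flatnonzero(np.r_[True, b[:-1,0] != b[1:,0]])
        # Maximum y within each run
        grp_maxY = np.maximum.reduceat(b[:,1], grp_idx)
        return np.c_[b[grp_idx,0], grp_maxY]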
    

    Alternatively, np.unique can supply grp_idx: on the sorted array, np.unique(b[:,0], return_index=True)[1] gives the index of the first row of each group.
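A minimal sketch of that np.unique variant might look like the following. The function name groupby_max_y_unique is made up for illustration, and the array is sorted by the first column before np.unique is called so that the returned first-occurrence indices mark the group starts.

```python
import numpy as np

def groupby_max_y_unique(a):
    # Hypothetical helper name; illustrates the np.unique variant only.
    # Sort rows by x so rows sharing an x value are contiguous.
    b = a[a[:, 0].argsort()]
    # keys: sorted unique x values; grp_idx: index of the first row of each x in b.
    keys, grp_idx = np.unique(b[:, 0], return_index=True)
    # Reduce y with max over each slice b[grp_idx[i]:grp_idx[i+1]].
    grp_max_y = np.maximum.reduceat(b[:, 1], grp_idx)
    return np.c_[keys, grp_max_y]

# Same values as the sample array used in this answer, written out literally.
arr = np.array([[4, 0], [3, 3], [3, 1], [3, 2], [4, 0],
                [0, 4], [2, 1], [0, 1], [1, 0], [1, 4]])
result = groupby_max_y_unique(arr)
print(result)
```

Since np.unique sorts internally, this does a little more work than the flatnonzero version, but it reads more directly.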

    Sample run -

    In [453]: np.random.seed(0)
    
    In [454]: arr = np.random.randint(0,5,(10,2))
    
    In [455]: arr
    Out[455]: 
    array([[4, 0],
           [3, 3],
           [3, 1],
           [3, 2],
           [4, 0],
           [0, 4],
           [2, 1],
           [0, 1],
           [1, 0],
           [1, 4]])
    
    In [456]: grouby_maxY(arr)
    Out[456]: 
    array([[0, 4],
           [1, 4],
           [2, 1],
           [3, 3],
           [4, 0]])
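To sanity-check the output above, one option is a slow but obvious reference in plain Python: keep the largest y seen for each x in a dict. This is only a cross-check sketch; the name max_y_per_x_reference is an assumption, not part of the answer's method.

```python
import numpy as np

def max_y_per_x_reference(a):
    # Hypothetical reference implementation for checking results on small inputs.
    best = {}
    for x, y in a.tolist():
        # Keep the largest y seen so far for this x.
        if x not in best or y > best[x]:
            best[x] = y
    # Rows sorted by x, matching the ordering of the vectorized version.
    return np.array(sorted(best.items()))

# The sample array from the run above, written out literally.
arr = np.array([[4, 0], [3, 3], [3, 1], [3, 2], [4, 0],
                [0, 4], [2, 1], [0, 1], [1, 0], [1, 4]])
expected = max_y_per_x_reference(arr)
print(expected)
```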
    