Find unique elements of floating point array in numpy (with comparison using a delta value)

萌比男神i 2020-12-14 17:06

I've got an ndarray of floating point values in numpy and I want to find the unique values of this array. Of course, this has problems because of floating point

4 Answers
  •  不知归路
    2020-12-14 17:35

    Another possibility is to just round to the nearest desirable tolerance:

    np.unique(a.round(decimals=4))
    

    where a is your original array.
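As a rough illustration of the rounding approach, here is a minimal sketch. The sample arrays and the variable names (`rounded_unique`, `boundary_unique`) are made up for this example. It also shows one limitation worth knowing about: two values that are extremely close but fall on opposite sides of a rounding boundary are still reported as distinct.

```python
import numpy as np

# Sample data (assumed for illustration): 6.5 and 6.500000001 should count as one value
a = np.array([6.4, 6.500000001, 6.5, 6.51])
rounded_unique = np.unique(a.round(decimals=4))

# Limitation: these two values differ by only 2e-8, but they round to
# 0.0 and 0.0001 respectively, so np.unique keeps both
b = np.array([0.00004999, 0.00005001])
boundary_unique = np.unique(b.round(decimals=4))

print(rounded_unique)
print(boundary_unique)
```

If that boundary behaviour matters for your data, a tolerance-based comparison on the sorted values avoids it, at the cost of speed.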

    Edit: Just to note that my solution and @unutbu's are nearly identical speed-wise (mine is maybe 5% faster) according to my timings, so either is a good solution.

    Edit #2: This is meant to address Paul's concern. It is definitely slower, and there may be some optimizations one can make, but I'm posting it as-is to demonstrate the strategy:

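    import numpy as np

    def eclose(a, b, rtol=1e-05, atol=1e-08):
        # Element-wise closeness test, same form as np.isclose
        return np.abs(a - b) <= (atol + rtol * np.abs(b))
    
    x = np.array([6.4, 6.500000001, 6.5, 6.51])
    y = x.flat.copy()
    y.sort()
    ci = 0  # index of the first value not yet assigned to a group
    
    U = np.empty((0,), dtype=y.dtype)
    
    while ci < y.size:
        ii = eclose(y[ci], y)        # mask of values close to y[ci]
        mi = np.max(ii.nonzero())    # last index of that group in sorted order
        U = np.concatenate((U, [y[mi]]))
        ci = mi + 1                  # continue with the next group
    
    print(U)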
    

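    This should be reasonably fast when many values repeat within the tolerance, because each pass through the loop consumes a whole group of close values. If most of the values are unique, it will be slow: every iteration compares one value against the full array, so the cost grows quadratically with the number of distinct values. It may also be better to set U up as a list and append to it inside the while loop rather than calling np.concatenate each time, but that falls under 'further optimization'.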
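The list-based variant mentioned above could look roughly like this. It is only a sketch: the function name `unique_close` is hypothetical, and it assumes the input contains no NaN values (a NaN is not close to anything, including itself, so the group search would find no match and raise an error).

```python
import numpy as np

def unique_close(values, rtol=1e-05, atol=1e-08):
    """Hypothetical helper: one representative per group of close values."""
    y = np.sort(np.asarray(values, dtype=float).ravel())
    reps = []  # plain list, converted to an array once at the end
    ci = 0     # index of the first value not yet assigned to a group
    while ci < y.size:
        # Same closeness test as eclose(): |a - b| <= atol + rtol * |b|
        close = np.abs(y[ci] - y) <= (atol + rtol * np.abs(y))
        mi = int(np.flatnonzero(close).max())  # last index of the group
        reps.append(y[mi])
        ci = mi + 1
    return np.array(reps)

result = unique_close([6.4, 6.500000001, 6.5, 6.51])
print(result)
```

Appending to a list avoids reallocating the output array on every iteration. The per-iteration comparison against the whole array is unchanged, so this does not help much when nearly all values are distinct.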
