Find unique elements of floating point array in numpy (with comparison using a delta value)

萌比男神i 2020-12-14 17:06

I've got an ndarray of floating point values in numpy and I want to find the unique values of this array. Of course, this has problems because of floating point

4 Answers
  •  無奈伤痛
    2020-12-14 17:45

    Don't floor and round both fail the OP's requirement in some cases?

    np.floor([5.99999999, 6.0]) # array([ 5.,  6.])
    np.round([6.50000001, 6.5], 0) #array([ 7.,  6.])
    

    The way I would do it (this may not be optimal, and is surely slower than the other answers) is something like this:

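    import numpy as np

    TOL = 1.0e-3
    a = np.random.random((10, 10))
    i = np.argsort(a, axis=None)  # indices that sort the flattened array
    # Gap from each sorted value to its predecessor; np.inf keeps the first value
    d = np.append(np.inf, np.diff(a.flat[i]))
    result = a.flat[i[d > TOL]]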
    

    Of course, this method keeps only the smallest member of each run of sorted values whose consecutive gaps are all within the tolerance. A chain of closely spaced values therefore collapses to a single value, even when the max-min of that chain is larger than the tolerance.

    Here is essentially the same algorithm, but it is easier to understand and should be faster because it avoids an indexing step:

    a = np.random.random((10,))
    b = np.sort(a)  # sorted copy; a is left unchanged
    # np.inf as the first gap keeps the smallest value
    d = np.append(np.inf, np.diff(b))
    result = b[d > TOL]
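
To make the caveat about runs of close values concrete, here is a minimal sketch that wraps the sorted-difference approach in a function and feeds it a chain of values spaced half a tolerance apart. The function name unique_sorted_gap is made up for illustration, and np.inf is used as the first gap so the smallest value is always kept.

```python
import numpy as np

TOL = 1.0e-3

def unique_sorted_gap(values, tol):
    """Keep each sorted value whose gap to its predecessor exceeds tol.

    Hypothetical helper for illustration; assumes a non-empty input.
    """
    b = np.sort(np.asarray(values, dtype=float).ravel())
    gaps = np.append(np.inf, np.diff(b))  # np.inf: the smallest value is always kept
    return b[gaps > tol]

# Ten values spaced 0.0005 apart: every neighbouring gap is below TOL
chain = 1.0 + 0.0005 * np.arange(10)
print(unique_sorted_gap(chain, TOL))  # a single value, the smallest one
print(chain.max() - chain.min())      # overall spread, several times TOL
```

The whole chain collapses to its smallest member even though its first and last values are several tolerances apart, so whether this behaviour is acceptable depends on how the data are distributed.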
    

    The OP may also want to look into scipy.cluster (for a fancy version of this method) or numpy.digitize (for a fancy version of the other two methods).
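
For the numpy.digitize route, a rough sketch might look like the following. The helper name unique_by_bins and the choice of bin edges (starting at the minimum and spaced by the tolerance) are assumptions made here for illustration, not something the answer specifies.

```python
import numpy as np

def unique_by_bins(values, tol):
    """Return the first-seen value from each bin of width tol.

    Hypothetical helper for illustration; assumes a non-empty input.
    """
    a = np.asarray(values, dtype=float).ravel()
    # Edges start at the minimum and step by tol; the extra tol covers the maximum
    edges = np.arange(a.min(), a.max() + tol, tol)
    bin_ids = np.digitize(a, edges)  # bin index of every value
    # Index of the first occurrence of each distinct bin index
    _, first = np.unique(bin_ids, return_index=True)
    return a[np.sort(first)]  # keep the original order of appearance

print(unique_by_bins([0.0, 0.02, 0.55, 0.57, 0.93], 0.1))
# 0.099 and 0.101 differ by only 0.002 but fall on opposite sides of the 0.1 edge
print(unique_by_bins([0.0, 0.099, 0.101], 0.1))
```

Like floor and round, binning can separate two values that lie within the tolerance of each other but on opposite sides of a bin edge, as the second call shows. It avoids the chaining effect of the sorted-difference method at the price of that edge sensitivity.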
