Is there a numpy builtin to reject outliers from a list

后端 未结 10 773
孤城傲影
孤城傲影 2020-11-28 18:00

Is there a numpy builtin to do something like the following? That is, take a list d and return a list filtered_d with any outlying elements removed

10条回答
  •  一向
    一向 (楼主)
    2020-11-28 19:00

    Something important when dealing with outliers is that one should try to use estimators as robust as possible. The mean of a distribution will be biased by outliers but e.g. the median will be much less.

    Building on eumiro's answer:

    def reject_outliers(data, m = 2.):
        d = np.abs(data - np.median(data))
        mdev = np.median(d)
        s = d/mdev if mdev else 0.
        return data[s

    Here I have replace the mean with the more robust median and the standard deviation with the median absolute distance to the median. I then scaled the distances by their (again) median value so that m is on a reasonable relative scale.

    Note that for the data[s syntax to work, data must be a numpy array.

提交回复
热议问题