Pythonic way of detecting outliers in one-dimensional observation data
For the given data, I want to set the outlier values (defined by a 95% confidence level, a 95% quantile function, or whatever is required) to NaN. Below are my data and the code I am currently using. I would be glad if someone could explain this further.

    import numpy as np
    import matplotlib.pyplot as plt

    # 1000 uniform random samples shifted into the range [5, 6)
    data = np.random.rand(1000) + 5.0

    plt.plot(data)
    plt.xlabel('observation number')
    plt.ylabel('recorded value')
    plt.show()

Joe Kington

The problem with using percentile is that the points identified as outliers are a function of your sample size. There are a huge number of ways to test