Finding outliers in a data set

说谎 2021-02-07 07:38

I have a Python script that creates a list of lists of server uptime and performance data, where each sub-list (or 'row') contains a particular cluster's stats. For example,

4 Answers

半阙折子戏 (original poster)

2021-02-07 07:44

I think your best bet is to look into SciPy's scoreatpercentile function (scipy.stats.scoreatpercentile). For instance, you could try excluding all the values that are above the 99th percentile.
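
Here is a minimal sketch of that idea, in case it helps. The helper name drop_above_percentile and the sample response times are made up for illustration; only scipy.stats.scoreatpercentile comes from SciPy itself. It computes the cutoff at a given percentile and keeps everything at or below it.

```python
from scipy.stats import scoreatpercentile


def drop_above_percentile(values, per=99):
    """Return (kept_values, cutoff), keeping values at or below the percentile.

    Hypothetical helper for illustration; not part of SciPy.
    """
    cutoff = scoreatpercentile(values, per)
    kept = [v for v in values if v <= cutoff]
    return kept, cutoff


# Made-up response times in ms; the last value is an obvious spike.
response_times = [12, 15, 11, 14, 13, 12, 16, 15, 14, 950]

kept, cutoff = drop_above_percentile(response_times, per=99)
print("cutoff:", cutoff)
print("kept:", kept)
```

Note that a percentile cutoff always trims the top of the data, whether or not those values are really anomalous, so it is worth checking what gets dropped. If you would rather not depend on SciPy for this, numpy.percentile does a similar job.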

The mean and standard deviation are not much use for spotting outliers if your data isn't normally distributed.

Generally it's good to have a rough visual idea of what your data looks like. There is matplotlib; I recommend you make some plots of your data with it before deciding on a plan.
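
As a rough sketch of what such a first look could be, the snippet below draws a histogram and a box plot side by side. The function name plot_overview, the bin count, and the sample data are assumptions for illustration, not anything from the question. The Agg backend is selected only so that it runs without a display; drop that line if you want an interactive window.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt


def plot_overview(values, bins=20):
    """Draw a histogram and a box plot of one metric and return the figure.

    Hypothetical helper for illustration.
    """
    fig, (ax_hist, ax_box) = plt.subplots(1, 2, figsize=(10, 4))
    ax_hist.hist(values, bins=bins)
    ax_hist.set_title("Histogram")
    ax_box.boxplot(values)
    ax_box.set_title("Box plot")
    return fig


# Made-up response times in ms; the last value is an obvious spike.
response_times = [12, 15, 11, 14, 13, 12, 16, 15, 14, 950]
fig = plot_overview(response_times)
```

Call fig.savefig("overview.png") to write the plot to disk. Isolated points far from the bulk of the data in either panel are the candidates to look at more closely.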
