Extract outliers from Seaborn Boxplot

Submitted by 隐身守侯 on 2020-08-04 18:38:52

Question


Is there a way to extract all outliers after plotting a Seaborn boxplot? For example, suppose I am plotting a boxplot for the data below:

      client                total
1      LA                     1
2      Sultan                128
3      ElderCare              1
4      CA                     3
5      More                  900

I want to see the records below returned as outliers after the boxplot is plotted:

2      Sultan                128
5      More                  900

Answer 1:


Seaborn uses matplotlib to handle outlier calculations, meaning the key parameter, whis, is passed on to ax.boxplot. The specific function that takes care of the calculation is documented here: https://matplotlib.org/api/cbook_api.html#matplotlib.cbook.boxplot_stats. Rather than extracting the outliers from the plot, you can calculate them directly with matplotlib.cbook.boxplot_stats. The following code snippet shows the calculation and that it matches the seaborn plot:

import matplotlib.pyplot as plt
from matplotlib.cbook import boxplot_stats
import pandas as pd
import seaborn as sns

data = [
    ('LA', 1),
    ('Sultan', 128),
    ('ElderCare', 1),
    ('CA', 3),
    ('More', 900),
]
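df = pd.DataFrame(data, columns=('client', 'total'))
# Wide-form call: seaborn draws the numeric column 'total' at x = 0
ax = sns.boxplot(data=df)
# boxplot_stats returns one dict per dataset; 'fliers' holds the outlier values
outliers = [y for stat in boxplot_stats(df['total']) for y in stat['fliers']]
print(outliers)
# Draw the computed outliers at x = 1, beside the box, to compare with seaborn's fliers
for y in outliers:
    ax.plot(1, y, 'p')
ax.set_xlim(right=1.5)
plt.show()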
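
To make the rule behind boxplot_stats concrete, here is a minimal sketch that reproduces the default calculation by hand with pandas. It assumes the default whis of 1.5 and linear-interpolated quartiles; the names whis, lower, upper and manual_outliers are illustrative and not part of any library.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "client": ["LA", "Sultan", "ElderCare", "CA", "More"],
        "total": [1, 128, 1, 3, 900],
    }
)

whis = 1.5  # assumed: the default whisker multiplier

# Quartiles with linear interpolation (the pandas default)
q1 = df["total"].quantile(0.25)
q3 = df["total"].quantile(0.75)
iqr = q3 - q1

# Points beyond these limits are treated as outliers
lower = q1 - whis * iqr
upper = q3 + whis * iqr

manual_outliers = df[(df["total"] < lower) | (df["total"] > upper)]
print(lower, upper)
print(manual_outliers)
```

For this sample the upper limit works out to 128 + 1.5 * 127 = 318.5, so only the row with total 900 is flagged. The value 128 is the third quartile itself and therefore lies inside the box rather than beyond the whisker.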




Answer 2:


The code below gives you an array of the outlier values, which you can then use to extract the matching rows from the dataframe.

from matplotlib.cbook import boxplot_stats
# 'colname' is a placeholder for the numeric column of interest
boxplot_stats(df['colname'])[0]['fliers']
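
As a sketch of the second step, the flier values can be fed to Series.isin to pull the full records back out of the dataframe. The dataframe here is rebuilt from the question's sample data, and the variable names fliers and outlier_rows are only illustrative.

```python
import pandas as pd
from matplotlib.cbook import boxplot_stats

df = pd.DataFrame(
    {
        "client": ["LA", "Sultan", "ElderCare", "CA", "More"],
        "total": [1, 128, 1, 3, 900],
    }
)

# One stats dict is returned per dataset; take the first and read its fliers
fliers = boxplot_stats(df["total"])[0]["fliers"]

# Keep only the rows whose 'total' matches one of the flier values
outlier_rows = df[df["total"].isin(fliers)]
print(outlier_rows)
```

Matching on values means every row that shares a flier value is returned, which is usually what you want when duplicates are also outliers.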


Source: https://stackoverflow.com/questions/53735603/extract-outliers-from-seaborn-boxplot
