Matplotlib boxplot show only max and min fliers

99封情书 posted on 2019-12-21 13:08:13

Question


I am making standard Matplotlib boxplots using the plt.boxplot() command. My line of code that creates the boxplot is:

bp = plt.boxplot(data, whis=[5, 95], showfliers=True)

Because my data is widely spread, I am getting a lot of fliers outside the range of the whiskers. To get a cleaner, publication-quality plot, I would like to show only a single flier at the maximum and a single flier at the minimum of the data, instead of all fliers. Is this possible? I don't see any built-in option in the documentation to do this.

(I can set the range of the whiskers to max/min, but this is not what I want. I would like to keep the whiskers at the 5th and 95th percentiles).

Below is the figure I am working on. Notice the density of fliers.


Answer 1:


plt.boxplot() returns a dictionary, where the key 'fliers' contains the upper and lower fliers as Line2D objects. You can modify them after the boxplot call and before the figure is shown or saved, like this:

Only on matplotlib >= 1.4.0

import numpy as np

bp = plt.boxplot(data, whis=[5, 95], showfliers=True)

# bp['fliers'] is a list with one Line2D per box. Each Line2D holds all
# flier points of that box (low and high), in no guaranteed order.
fliers = bp['fliers']

for fly in fliers:
    x, y = fly.get_data()
    if len(y) == 0:  # this box has no fliers
        continue
    # Keep only the lowest and highest flier; all fliers of a box share one x.
    fly.set_data([x[0], x[0]], [np.min(y), np.max(y)])
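As a self-contained illustration of the approach above, here is a minimal sketch that wraps the trimming step in a helper and applies it to two boxes. The helper name keep_extreme_fliers, the synthetic data, and the Agg backend are assumptions made for the example, not part of the original answer. It picks the minimum and maximum explicitly rather than relying on the order of the flier points.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np


def keep_extreme_fliers(bp):
    """Reduce every flier Line2D in a boxplot dict to its min and max point.

    Hypothetical helper for illustration; `bp` is the dict returned by boxplot().
    """
    for line in bp["fliers"]:
        x, y = line.get_data()
        y = np.asarray(y)
        if y.size == 0:  # nothing to trim for this box
            continue
        # All fliers of one box share the same x, so x[0] is enough.
        line.set_data([x[0], x[0]], [y.min(), y.max()])


rng = np.random.default_rng(0)  # fixed seed, synthetic data
data = [rng.normal(size=500), rng.normal(loc=2.0, scale=3.0, size=500)]

fig, ax = plt.subplots()
bp = ax.boxplot(data, whis=[5, 95], showfliers=True)
keep_extreme_fliers(bp)
fig.canvas.draw()  # render; use fig.savefig(...) or plt.show() in real use
```

Because the helper only touches bp["fliers"], the whiskers stay at the 5th and 95th percentiles.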

On older versions

If you are on an older version of matplotlib, the fliers for each boxplot are represented by two lines, not one. Thus, the loop would look something like this:

import numpy as np

fliers = bp['fliers']
for i in range(len(fliers)):
    x, y = fliers[i].get_data()
    if len(y) == 0:  # no fliers on this side of the box
        continue
    # Even indices hold the upper fliers (keep the maximum);
    # odd indices hold the lower fliers (keep the minimum).
    idx = np.argmax(y) if i % 2 == 0 else np.argmin(y)
    fliers[i].set_data([x[idx]], [y[idx]])

Also note that in matplotlib < 1.4 the showfliers argument doesn't exist and the whis argument doesn't accept lists.

Of course (for simple applications) you could plot the boxplot without fliers and add the max and min points to the plot:

bp = plt.boxplot(data, whis=[5, 95], showfliers=False)
sc = plt.scatter([1, 1], [data.min(), data.max()])

where [1, 1] gives the x-positions of the two points (a single box is drawn at x = 1 by default). This assumes data is a single NumPy array.
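For several boxes, the same idea can be extended by computing one minimum and one maximum per group and placing them at the default box positions 1, 2, ..., N. The following is a rough sketch under that assumption; the variable names, the synthetic groups, and the marker styling are illustrative choices only.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(1)  # fixed seed, synthetic data
groups = [
    rng.normal(size=300),
    rng.normal(loc=1.0, size=300),
    rng.normal(loc=-1.0, size=300),
]

fig, ax = plt.subplots()
bp = ax.boxplot(groups, whis=[5, 95], showfliers=False)

# Without an explicit `positions` argument, boxes sit at x = 1, 2, ..., N.
positions = np.arange(1, len(groups) + 1)
mins = np.array([g.min() for g in groups])
maxs = np.array([g.max() for g in groups])

# Draw one open circle at each group's minimum and maximum.
ax.scatter(positions, mins, marker="o", facecolors="none", edgecolors="k")
ax.scatter(positions, maxs, marker="o", facecolors="none", edgecolors="k")
```

If you pass your own positions to boxplot(), reuse the same values for the scatter calls so the points stay aligned with the boxes.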




Answer 2:


import numpy as np

fliers = bp['fliers']
for box in fliers:  # one Line2D of flier points per box
    x, y = box.get_xdata(), box.get_ydata()
    # Any value from x works, since all fliers of a box share the same x.
    box.set_data([x[0], x[0]], [np.min(y), np.max(y)])

Resulting figure, showing only max and min fliers:



Source: https://stackoverflow.com/questions/28521828/matplotlib-boxplot-show-only-max-and-min-fliers
