Matplotlib histogram with collection bin for high values

情话喂你 2020-12-13 20:25

I have an array with values, and I want to create a histogram of it. I am mainly interested in the low end numbers, and want to collect every number above 300 in one bin. Th

2 Answers
  •  暗喜
    暗喜 (OP)
    2020-12-13 20:35

    Sorry, I am not very familiar with matplotlib, so I only have a dirty hack for you: I put all values that are greater than 300 into one bin and adjusted the bin edges to match.
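
    The pooling step on its own can be checked without plotting anything. This is a minimal sketch, assuming non-negative integer data; the helper name `clipped_counts` and its `cap` and `width` parameters are made up for illustration and are not part of the original code.

```python
import numpy as np

def clipped_counts(values, cap=300, width=25):
    """Count values in fixed-width bins, pooling everything above `cap`.

    Hypothetical helper: assumes non-negative values.
    """
    values = np.asarray(values)
    # Map every value above `cap` to `cap + 1` so it lands in the final bin.
    clipped = np.minimum(values, cap + 1)
    # Edges 0, 25, ..., 300, 325: the last bin [300, 325] is the overflow bin.
    edges = np.arange(0, cap + 2 * width, width)
    counts, _ = np.histogram(clipped, bins=edges)
    return counts, edges

rng = np.random.default_rng(1)
sample = rng.integers(0, 600, size=200)  # made-up test data
counts, edges = clipped_counts(sample)
print(counts)
```

    No value is dropped, and the last entry of `counts` holds everything from 300 upward.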

    The root of the problem is that matplotlib places every bin on a numeric x-axis, so the bins are treated as real numbers. In R I would convert my bins to a factor variable so that they are not treated that way.

    import matplotlib.pyplot as plt
    import numpy as np
    
    def plot_histogram_01():
        np.random.seed(1)
        values_A = np.random.choice(np.arange(600), size=200, replace=True).tolist()
        values_B = np.random.choice(np.arange(600), size=200, replace=True).tolist()
        values_A_to_plot = [301 if i > 300 else i for i in values_A]
        values_B_to_plot = [301 if i > 300 else i for i in values_B]
    
        bins = [0, 25, 50, 75, 100, 125, 150, 175, 200, 225, 250, 275, 300, 325]
    
        fig, ax = plt.subplots(figsize=(9, 5))
        _, bins, patches = plt.hist([values_A_to_plot, values_B_to_plot], density=True,  # `density` replaces the old `normed` argument, which newer Matplotlib no longer accepts
                                    bins=bins,
                                    color=['#3782CC', '#AFD5FA'],
                                    label=['A', 'B'])
    
        # Label each bin by its upper edge; plain str avoids b'25'-style byte labels on Python 3
        xlabels = [str(int(edge)) for edge in bins[1:]]
        xlabels[-1] = '300+'
    
        N_labels = len(xlabels)
    
        plt.xticks(25 * np.arange(N_labels) + 12.5)
        ax.set_xticklabels(xlabels)
    
        plt.yticks([])
        plt.title('')
        plt.setp(patches, linewidth=0)
        plt.legend()
    
        fig.tight_layout()
        plt.savefig('my_plot_01.png')
        plt.close()
    
    plot_histogram_01()
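
    The factor-variable idea mentioned above can also be approximated in matplotlib by counting first and then drawing bars at evenly spaced positions, so the overflow bin is just one more category. This is only a sketch under assumptions: the data is non-negative, and the helper `binned_categories`, the range-style labels, and the output file name are invented for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

def binned_categories(values, cap=300, width=25):
    """Return per-bin counts and labels, with one open-ended bin from `cap` up.

    Hypothetical helper: assumes all values are >= 0.
    """
    edges = np.arange(0, cap + width, width)  # 0, 25, ..., 300
    # np.digitize returns 1..len(edges); len(edges) means value >= cap.
    idx = np.digitize(values, edges) - 1
    counts = np.bincount(idx, minlength=len(edges))
    labels = [f"{lo}-{lo + width}" for lo in edges[:-1]] + [f"{cap}+"]
    return counts, labels

rng = np.random.default_rng(1)
values_a = rng.integers(0, 600, size=200)  # made-up test data
values_b = rng.integers(0, 600, size=200)
counts_a, labels = binned_categories(values_a)
counts_b, _ = binned_categories(values_b)

# Evenly spaced positions play the role of factor levels.
positions = np.arange(len(labels))
fig, ax = plt.subplots(figsize=(9, 5))
ax.bar(positions - 0.2, counts_a, width=0.4, color="#3782CC", label="A")
ax.bar(positions + 0.2, counts_b, width=0.4, color="#AFD5FA", label="B")
ax.set_xticks(positions)
ax.set_xticklabels(labels, rotation=45, ha="right")
ax.legend()
fig.tight_layout()
fig.savefig("my_plot_categorical.png")
plt.close(fig)
```

    Because the x positions are plain indices, the "300+" bar gets the same width as the others no matter how far the data extends.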
    

