Indicating the statistically significant difference in bar graph

遥遥无期 2020-12-08 03:24

I use a bar graph to show the data for each group. Some of the bars differ significantly from each other. How can I indicate these significant differences in the bar plot?

2 Answers
  • 2020-12-08 03:58

    The answer above inspired me to write a small but flexible function myself:

    import matplotlib.pyplot as plt
    import numpy as np


    def barplot_annotate_brackets(num1, num2, data, center, height, yerr=None, dh=.05, barh=.05, fs=None, maxasterix=None):
        """
        Annotate barplot with p-values.
    
        :param num1: index of left bar to put bracket over
        :param num2: index of right bar to put bracket over
        :param data: string to write, or a p-value from which asterisks are generated
        :param center: centers of all bars (like plt.bar() input)
        :param height: heights of all bars (like plt.bar() input)
        :param yerr: yerrs of all bars (like plt.bar() input)
        :param dh: height offset over bar / bar + yerr in axes coordinates (0 to 1)
        :param barh: bar height in axes coordinates (0 to 1)
        :param fs: font size
        :param maxasterix: maximum number of asterisks to write (for very small p-values)
        """
    
        if isinstance(data, str):
            text = data
        else:
            # * is p < 0.05
            # ** is p < 0.005
            # *** is p < 0.0005
            # etc.
            text = ''
            p = .05
    
            while data < p:
                text += '*'
                p /= 10.
    
                if maxasterix and len(text) == maxasterix:
                    break
    
            if len(text) == 0:
                text = 'n. s.'
    
        lx, ly = center[num1], height[num1]
        rx, ry = center[num2], height[num2]
    
        if yerr is not None:  # a plain truth test fails for NumPy arrays
            ly += yerr[num1]
            ry += yerr[num2]
    
        ax_y0, ax_y1 = plt.gca().get_ylim()
        dh *= (ax_y1 - ax_y0)
        barh *= (ax_y1 - ax_y0)
    
        y = max(ly, ry) + dh
    
        barx = [lx, lx, rx, rx]
        bary = [y, y+barh, y+barh, y]
        mid = ((lx+rx)/2, y+barh)
    
        plt.plot(barx, bary, c='black')
    
        kwargs = dict(ha='center', va='bottom')
        if fs is not None:
            kwargs['fontsize'] = fs
    
        plt.text(*mid, text, **kwargs)
    

    This gives me nice annotations with relatively little effort, e.g.:

    heights = [1.8, 2, 3]
    bars = np.arange(len(heights))
    
    plt.figure()
    plt.bar(bars, heights, align='center')
    plt.ylim(0, 5)
    barplot_annotate_brackets(0, 1, .1, bars, heights)
    barplot_annotate_brackets(1, 2, .001, bars, heights)
    barplot_annotate_brackets(0, 2, 'p < 0.0075', bars, heights, dh=.2)
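
The star conversion inside the function can also be looked at in isolation. The following is a minimal sketch of that loop as a standalone helper; the name `p_to_stars` and the parameter `max_stars` are hypothetical and not part of the answer. Note that the thresholds shrink by a factor of 10 per star (0.05, 0.005, 0.0005, ...), which differs from the 0.05 / 0.01 / 0.001 convention used in many fields, so the cut-offs may need adjusting.

```python
def p_to_stars(p, max_stars=None):
    """Convert a p-value to a star string (hypothetical helper).

    Mirrors the loop in barplot_annotate_brackets: one star per
    threshold passed, with thresholds 0.05, 0.005, 0.0005, ...
    """
    stars = ''
    threshold = 0.05
    while p < threshold:
        stars += '*'
        threshold /= 10.0
        # Optional cap so that tiny p-values do not produce long strings
        if max_stars and len(stars) == max_stars:
            break
    # No threshold passed: report "not significant"
    return stars or 'n. s.'


for p in (0.1, 0.01, 0.001, 1e-9):
    print(p, p_to_stars(p, max_stars=3))
```

With these thresholds, the call with `.001` in the example above yields two stars, not three.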
    

  • 2020-12-08 03:59

    I've done a couple of things here that I suggest when working with complex plots. Pull the custom formatting out into a dictionary: it makes life simple when you want to change a parameter, and you can pass the same dictionary to multiple plots. I've also written a custom function to annotate the differences between pairs of bars. As a bonus, it can annotate between (A,C) if you really want to (I stand by my comment that this isn't the right visual approach, however). It may need some tweaking once the data changes, but this should put you on the right track.

    import numpy as np
    import matplotlib.pyplot as plt
    menMeans   = (5, 15, 30, 40)
    menStd     = (2, 3, 4, 5)
    ind  = np.arange(4)    # the x locations for the groups
    width= 0.7
    labels = ('A', 'B', 'C', 'D')
    
    # Pull the formatting out here
    bar_kwargs = {'width':width,'color':'y','linewidth':2,'zorder':5}
    err_kwargs = {'zorder':0,'fmt':'none','linewidth':2,'ecolor':'k'}  # 'fmt':'none' draws only the error bars (matplotlib >= 1.4)
    
    fig, ax = plt.subplots()
    ax.bar(ind, menMeans, **bar_kwargs)
    ax.errorbar(ind, menMeans, yerr=menStd, **err_kwargs)
    
    
    # Custom function to draw the diff bars
    
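    def label_diff(i,j,text,X,Y):
        # Midpoint between the two bars, used to center the label
        x = (X[i]+X[j])/2
        # Place the bracket 10% above the taller of the two bars
        y = 1.1*max(Y[i], Y[j])
    
        props = {'connectionstyle':'bar','arrowstyle':'-',
                 'shrinkA':20,'shrinkB':20,'linewidth':2}
        # The +7 offset is in data units and is tuned to this data set
        ax.annotate(text, xy=(x,y+7), ha='center', zorder=10)
        ax.annotate('', xy=(X[i],y), xytext=(X[j],y), arrowprops=props)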
    
    # Call the function
    label_diff(0,1,'p=0.0370',ind,menMeans)
    label_diff(1,2,'p<0.0001',ind,menMeans)
    label_diff(2,3,'p=0.0025',ind,menMeans)
    
    
    plt.ylim(top=60)  # the 'ymax' keyword was removed in newer matplotlib versions
    plt.xticks(ind, labels, color='k')
    plt.show()
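
The fixed numbers in `label_diff` (the factor 1.1 and the offset of 7) are tied to the scale of this data set, which is probably why some tweaking is needed when the data changes. One possible alternative, sketched here under assumptions, is to derive the bracket height from the y-axis range and the error bars, similar in spirit to the `dh` parameter of the other answer. The helper name `bracket_position` and its parameters `ylim` and `pad` are hypothetical and not part of the original answer.

```python
def bracket_position(i, j, X, Y, yerr=None, ylim=(0.0, 1.0), pad=0.05):
    """Return (x_mid, y_line) for a bracket between bars i and j.

    Hypothetical helper: pad is a fraction of the y-axis range, so the
    spacing scales with the data instead of being a fixed number.
    """
    # Top of each bar, including its error bar if one is given
    top_i = Y[i] + (yerr[i] if yerr is not None else 0)
    top_j = Y[j] + (yerr[j] if yerr is not None else 0)
    span = ylim[1] - ylim[0]
    x_mid = (X[i] + X[j]) / 2
    y_line = max(top_i, top_j) + pad * span
    return x_mid, y_line


menMeans = (5, 15, 30, 40)
menStd = (2, 3, 4, 5)
ind = range(4)
x_mid, y_line = bracket_position(1, 2, ind, menMeans, menStd, ylim=(0, 60))
print(x_mid, y_line)
```

The returned pair could then be used as the label position and bracket height in place of the hard-coded values.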
    

