Stacked bar plot in matplotlib with a label on each section

眼角桃花 2020-11-27 05:59

I am trying to replicate the following image in matplotlib and it seems barh is my only option. Though it appears that you can't stack barh graphs.

2 Answers
  •  难免孤独
    2020-11-27 06:16

    • The easiest way to plot a horizontal or vertical stacked bar is to load the data into a pandas.DataFrame
      • This will plot and annotate correctly, even when some categories ('People') don't have all segments (e.g. some value is 0 or NaN)
    • Once the data is in the dataframe:
      1. It's easier to manipulate and analyze
      2. It can be plotted with the matplotlib engine, using:
        • pandas.DataFrame.plot.barh
          • label_text = f'{width}' for annotations
        • pandas.DataFrame.plot.bar
          • label_text = f'{height}' for annotations
          • SO: Vertical Stacked Bar Chart with Centered Labels
    • These methods return a matplotlib.axes.Axes or a numpy.ndarray of them.
    • The .patches attribute of the Axes is a list of matplotlib.patches.Rectangle objects, one for each of the sections of the stacked bar.
      • Each .Rectangle has methods for extracting the various values that define the rectangle.
      • Each .Rectangle is in order from left to right, and bottom to top, so all the .Rectangle objects, for each level, appear in order, when iterating through .patches.
    • The labels are made using an f-string, label_text = f'{width:.2f}%', so any additional text can be added as needed.
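The vertical case mentioned in the list above can be sketched as follows. This is a minimal illustration, not the answer's own code: the sample values and the names df_v, ax_v and labels_v are made up for the example, and the Agg backend is selected only so the snippet runs without a display. The only change from the horizontal case is that the label text comes from the rectangle's height instead of its width.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; remove to open a window
import matplotlib.pyplot as plt
import pandas as pd

# hypothetical sample data (not the OP's values); one segment is 0
df_v = pd.DataFrame(
    {'Female': [3.0, 5.5, 0.0], 'Male': [4.5, 2.0, 6.0]},
    index=['A', 'B', 'C'],
)

# vertical stacked bars
ax_v = df_v.plot.bar(stacked=True, figsize=(6, 4))

labels_v = []
for rect in ax_v.patches:
    height = rect.get_height()  # for vertical bars, the height is the data value
    if height > 0:              # skip empty segments
        label_x = rect.get_x() + rect.get_width() / 2
        label_y = rect.get_y() + height / 2
        text = ax_v.text(label_x, label_y, f'{height:.2f}',
                         ha='center', va='center', fontsize=8)
        labels_v.append(text)

# call plt.show() here when using an interactive backend
```

The zero-valued segment produces a rectangle of height 0, so it is skipped and no stray label is drawn.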

    Create a DataFrame

    import pandas as pd
    import numpy as np
    import matplotlib.pyplot as plt  # needed for plt.show() below
    
    # create sample data as shown in the OP
    np.random.seed(365)
    people = ('A','B','C','D','E','F','G','H')
    bottomdata = 3 + 10 * np.random.rand(len(people))
    topdata = 3 + 10 * np.random.rand(len(people))
    
    # create the dataframe
    df = pd.DataFrame({'Female': bottomdata, 'Male': topdata}, index=people)
    
    # display(df)
       Female   Male
    A   12.41   7.42
    B    9.42   4.10
    C    9.85   7.38
    D    8.89  10.53
    E    8.44   5.92
    F    6.68  11.86
    G   10.67  12.97
    H    6.05   7.87
    

    Plot and Annotate

    • Plotting the bars takes 1 line; the remainder annotates the rectangles
    # plot the dataframe with 1 line
    ax = df.plot.barh(stacked=True, figsize=(8, 6))
    
    # .patches is everything inside of the chart
    for rect in ax.patches:
        # Find where everything is located
        height = rect.get_height()
        width = rect.get_width()
        x = rect.get_x()
        y = rect.get_y()
        
        # For barh, the width of the bar is the data value and can be used as the label
        label_text = f'{width:.2f}%'  # f'{width:.2f}' to format decimal values
        
        # ax.text(x, y, text)
        label_x = x + width / 2
        label_y = y + height / 2
        
        # only add a label when the segment has a positive width
        if width > 0:
            ax.text(label_x, label_y, label_text, ha='center', va='center', fontsize=8)
    
    # move the legend
    ax.legend(bbox_to_anchor=(1.05, 1), loc='upper left', borderaxespad=0.)
    
    # add labels
    ax.set_ylabel("People", fontsize=18)
    ax.set_xlabel("Percent", fontsize=18)
    plt.show()
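
On the question's concern that barh cannot be stacked: plain matplotlib can stack horizontal bars without pandas by passing the previous series as the left offset. The sketch below assumes made-up sample values, and the names people_m, female_m, male_m, fig_m, ax_m, bars_f and bars_m are hypothetical; the Agg backend is used only so it runs headless.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; remove to open a window
import matplotlib.pyplot as plt
import numpy as np

# hypothetical sample data
people_m = ['A', 'B', 'C']
female_m = np.array([3.0, 5.5, 2.0])
male_m = np.array([4.5, 2.0, 6.0])

fig_m, ax_m = plt.subplots(figsize=(6, 4))

# first series starts at x = 0
bars_f = ax_m.barh(people_m, female_m, label='Female')
# second series starts where the first one ends, which stacks the bars
bars_m = ax_m.barh(people_m, male_m, left=female_m, label='Male')

ax_m.legend()
# call plt.show() here when using an interactive backend
```

Each further series would use the running sum of the earlier ones as its left value. The rectangles still end up in ax_m.patches, so the same annotation loop applies.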
    

    Example with Missing Segment

    # set one of the dataframe values to 0
    df.iloc[4, 1] = 0
    
    • Note that, after re-running the plot code, the remaining annotations are still in the correct locations and the zero-width segment gets no label.
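
As an alternative to looping over the patches by hand, matplotlib 3.4 and later provide Axes.bar_label, which can center a label in each segment. The sketch below is not part of the original answer; it assumes matplotlib >= 3.4, uses made-up values, and the names df_b, ax_b and texts_b are hypothetical. Passing an empty string for zero-width segments keeps them unlabeled.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; remove to open a window
import matplotlib.pyplot as plt
import pandas as pd

# hypothetical sample data with one missing (zero) segment
df_b = pd.DataFrame(
    {'Female': [3.0, 5.5, 2.0], 'Male': [4.5, 0.0, 6.0]},
    index=['A', 'B', 'C'],
)

ax_b = df_b.plot.barh(stacked=True, figsize=(6, 4))

texts_b = []
# ax.containers holds one BarContainer per DataFrame column
for container in ax_b.containers:
    # build the label text from each rectangle's width; blank for empty segments
    labels = [f'{rect.get_width():.2f}%' if rect.get_width() > 0 else ''
              for rect in container]
    texts_b.extend(
        ax_b.bar_label(container, labels=labels, label_type='center', fontsize=8)
    )

# call plt.show() here when using an interactive backend
```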
