I am trying to replicate the following image in matplotlib, and it seems barh is my only option, though it appears that you can't stack barh graphs.
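The easiest way to plot a horizontal or vertical stacked bar is to load the data into a pandas.DataFrame. This plots and annotates correctly even when some categories ('People') don't have all segments (e.g. some value is 0 or NaN).

Once the data is in a dataframe, it can be plotted with the matplotlib engine, using:
- pandas.DataFrame.plot.barh, with label_text = f'{width}' for annotations
- pandas.DataFrame.plot.bar, with label_text = f'{height}' for annotations

These methods return a matplotlib.axes.Axes or a numpy.ndarray of them.

The .patches attribute holds a list of matplotlib.patches.Rectangle objects, one for each of the sections of the stacked bar.
Each Rectangle has methods for extracting the various values that define the rectangle.
The Rectangle objects are ordered from left to right and bottom to top, so all the Rectangle objects for each level appear in order when iterating through .patches.

The labels are made using an f-string, label_text = f'{width:.2f}%', so any additional text can be added as needed.

import pandas as pd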
import numpy as np
import matplotlib.pyplot as plt  # needed for plt.show() further down
# create sample data as shown in the OP
np.random.seed(365)
people = ('A','B','C','D','E','F','G','H')
bottomdata = 3 + 10 * np.random.rand(len(people))
topdata = 3 + 10 * np.random.rand(len(people))
# create the dataframe
df = pd.DataFrame({'Female': bottomdata, 'Male': topdata}, index=people)
# display(df)
   Female   Male
A   12.41   7.42
B    9.42   4.10
C    9.85   7.38
D    8.89  10.53
E    8.44   5.92
F    6.68  11.86
G   10.67  12.97
H    6.05   7.87
# plot the dataframe with 1 line
ax = df.plot.barh(stacked=True, figsize=(8, 6))
# .patches holds every rectangle (bar segment) drawn on the axes
for rect in ax.patches:
    # find where each segment is located
    height = rect.get_height()
    width = rect.get_width()
    x = rect.get_x()
    y = rect.get_y()

    # for a horizontal bar, the width of the segment is the data value and can be used as the label
    label_text = f'{width:.2f}%'  # f'{width:.2f}' to format decimal values

    # ax.text(x, y, text): place the label at the center of the segment
    label_x = x + width / 2
    label_y = y + height / 2

    # only plot labels for segments with a width greater than 0
    if width > 0:
        ax.text(label_x, label_y, label_text, ha='center', va='center', fontsize=8)
# move the legend
ax.legend(bbox_to_anchor=(1.05, 1), loc='upper left', borderaxespad=0.)
# add labels
ax.set_ylabel("People", fontsize=18)
ax.set_xlabel("Percent", fontsize=18)
plt.show()
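
To make the ordering of .patches concrete, here is a minimal sketch that rebuilds the same sample dataframe and inspects the rectangles without drawing any labels. It assumes pandas draws the columns one after another, so the first len(df) patches belong to 'Female' and the next len(df) to 'Male'. The names n, widths, lefts, female_widths, male_widths and male_lefts are only illustrative, and the Agg backend line is there so the sketch also runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to view the plot
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# same sample data as above
np.random.seed(365)
people = list("ABCDEFGH")
df = pd.DataFrame(
    {
        "Female": 3 + 10 * np.random.rand(len(people)),
        "Male": 3 + 10 * np.random.rand(len(people)),
    },
    index=people,
)

ax = df.plot.barh(stacked=True, figsize=(8, 6))

n = len(df)  # number of bars per column
widths = [rect.get_width() for rect in ax.patches]
lefts = [rect.get_x() for rect in ax.patches]

# first n rectangles: the 'Female' segments, which start at x = 0
female_widths = widths[:n]
# next n rectangles: the 'Male' segments, which start where 'Female' ends
male_widths = widths[n:]
male_lefts = lefts[n:]

print(len(ax.patches))  # one rectangle per (person, column) pair
```

Because each segment's width is the data value and its left edge is the running total of the segments before it, x + width / 2 always lands in the middle of the segment, which is what the labeling loop relies on.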
# set one of the dataframe values to 0
df.iloc[4, 1] = 0
df
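
To see what happens with a missing segment, the sketch below repeats the plot after setting one value to 0 and counts the labels that get drawn. It wraps the labeling loop in a helper, label_segments, which is a hypothetical name introduced here for illustration and not part of pandas or matplotlib. The assumption is that a zero-width rectangle is still present in .patches and is skipped only by the width > 0 check.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to view the plot
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def label_segments(ax, min_width=0.0):
    """Center a label in every bar segment wider than min_width (hypothetical helper)."""
    count = 0
    for rect in ax.patches:
        width = rect.get_width()
        if width > min_width:
            ax.text(
                rect.get_x() + width / 2,              # horizontal center of the segment
                rect.get_y() + rect.get_height() / 2,  # vertical center of the segment
                f"{width:.2f}%",
                ha="center",
                va="center",
                fontsize=8,
            )
            count += 1
    return count


# same sample data as above
np.random.seed(365)
people = list("ABCDEFGH")
df = pd.DataFrame(
    {
        "Female": 3 + 10 * np.random.rand(len(people)),
        "Male": 3 + 10 * np.random.rand(len(people)),
    },
    index=people,
)

# set one of the dataframe values to 0 (row 'E', column 'Male')
df.iloc[4, 1] = 0

ax = df.plot.barh(stacked=True, figsize=(8, 6))
n_labels = label_segments(ax)
print(n_labels)  # one label fewer than the number of rectangles
```

The remaining labels stay centered in their own segments because each position is computed from the rectangle itself and not from its index in the dataframe. Calling plt.show() afterwards (with an interactive backend) displays the result.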