Annotate stacked barplot matplotlib and pandas [duplicate]

 ̄綄美尐妖づ submitted on 2019-12-12 09:06:49

Question


I have a simple DataFrame that stores the results of a survey. The columns are:

| Age | Income | Satisfaction |

All of them contain values between 1 and 5 (categorical). I managed to generate a stacked bar plot that shows the distribution of Satisfaction values across people of different ages. The code is:

import random
import pandas as pd

# create a random df
data = []
for i in range(500):
    sample = {"age" : random.randint(0,5), "income" : random.randint(1,5), "satisfaction" : random.randint(1,5)}
    data.append(sample)  # must be inside the loop, otherwise only the last sample is kept
df = pd.DataFrame(data)
# group by age and count each satisfaction value
counter = df.groupby('age')['satisfaction'].value_counts().unstack()
# calculate the % for each age group
percentage_dist = 100 * counter.divide(counter.sum(axis = 1), axis = 0)
percentage_dist.plot.bar(stacked=True)

This generates the following, desired, plot:

However, it is difficult to tell whether the green subset (percentage) of Age-0 is higher than the one in Age-2. Is there a way of adding the percentage on top of each sub-section of the bar plot? Something like this, but for every single bar:


Answer 1:


One option is to iterate over the patches to obtain their width, height and bottom-left coordinates, and use these values to place the label at the center of the corresponding bar segment.

To do this, the axes returned by the pandas bar method must be stored.

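# keep the Axes returned by pandas so its patches can be inspected
ax = percentage_dist.plot.bar(stacked=True)
for p in ax.patches:
    width, height = p.get_width(), p.get_height()
    x, y = p.get_xy()  # bottom-left corner of the segment
    # place the label at the center of the segment
    ax.text(x+width/2,
            y+height/2,
            '{:.0f} %'.format(height),
            horizontalalignment='center',
            verticalalignment='center')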

Here, the annotated value is set to 0 decimals, but this can be easily modified.
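As a rough sketch of such a modification, the loop can be wrapped in a small helper that takes the number of decimals as an argument and skips empty or very thin segments. The helper name `annotate_stacked_bars` and its `decimals` and `min_height` parameters are made up for illustration and are not part of the original answer; the tiny hand-written `percentage_dist` only stands in for the survey data, and the `Agg` backend is selected only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no window is needed
import pandas as pd


def annotate_stacked_bars(ax, decimals=0, min_height=0.0):
    """Label each bar segment with its height, formatted as a percentage.

    Hypothetical helper: `decimals` sets the number of decimal places and
    `min_height` is the height a segment must exceed to receive a label.
    """
    labels = []
    for p in ax.patches:
        height = p.get_height()
        # Skip NaN, zero-height and too-thin segments (NaN > x is False)
        if not height > min_height:
            continue
        x, y = p.get_xy()  # bottom-left corner of the segment
        labels.append(
            ax.text(x + p.get_width() / 2,
                    y + height / 2,
                    f"{height:.{decimals}f} %",
                    horizontalalignment="center",
                    verticalalignment="center")
        )
    return labels


# Stand-in data: two age groups, two satisfaction values, rows sum to 100
percentage_dist = pd.DataFrame(
    {1: [25.0, 50.0], 2: [75.0, 50.0]},
    index=pd.Index([0, 1], name="age"),
)
ax = percentage_dist.plot.bar(stacked=True)
labels = annotate_stacked_bars(ax, decimals=1)
```

Calling `annotate_stacked_bars(ax)` with the default `decimals=0` gives the same label format as the loop above. Raising `min_height` (for example to 3) can help keep labels out of segments that are too thin to hold the text.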

The output plot generated with this code is the following:



Source: https://stackoverflow.com/questions/50160788/annotate-stacked-barplot-matplotlib-and-pandas
