Creating percentage stacked bar chart using groupby

大兔子大兔子 提交于 2021-02-19 06:20:06

问题


I'm looking at home ownership within levels of different loan statuses, and I'd like to display this using a stacked bar chart in percentages.

I've been able to create a frequency stacked bar chart using this code:

df_trunc1=df[['loan_status','home_ownership','id']]
sub_df1=df_trunc1.groupby(['loan_status','home_ownership'])['id'].count()
sub_df1.unstack().plot(kind='bar',stacked=True,rot=1,figsize=(8,8),title="Home ownership across Loan Types")

which gives me this picture:1

but I can't figure out how to transform the graph into percentages. So for example, I'd like to get within the default group, which percentage have a mortgage, which own, etc.

Here is my groupby table for context2:

Thanks!!


回答1:


I believe you need to convert the percentages yourself:

d = {('Default', 'MORTGAGE'): 498, ('Default', 'OWN'): 110, ('Default', 'RENT'): 611, ('Fully Paid', 'MORTGAGE'): 3100, ('Fully Paid', 'NONE'): 1, ('Fully Paid', 'OTHER'): 5, ('Fully Paid', 'OWN'): 558, ('Fully Paid', 'RENT'): 2568, ('Late (16-30 days)', 'MORTGAGE'): 1101, ('Late (16-30 days)', 'OWN'): 260, ('Late (16-30 days)', 'RENT'): 996, ('Late (31-120 days)', 'MORTGAGE'): 994, ('Late (31-120 days)', 'OWN'): 243, ('Late (31-120 days)', 'RENT'): 1081}

sub_df1 = pd.DataFrame(d.values(), columns=['count'], index=pd.MultiIndex.from_tuples(d.keys()))
sub_df2 = sub_df1.unstack()
sub_df2.columns = sub_df2.columns.droplevel()  # Drop `count` label.
sub_df2 = sub_df2.div(sub_df2.sum())
sub_df2.T.plot(kind='bar', stacked=True, rot=1, figsize=(8, 8), 
               title="Home ownership across Loan Types")

sub_df3 = sub_df1.unstack().T
sub_df3.index = sub_df3.index.droplevel()  # Drop `count` label.
sub_df3 = sub_df3.div(sub_df3.sum())
sub_df3.T.plot(kind='bar', stacked=True, rot=1, figsize=(8, 8), 
               title="Home ownership across Loan Types")




回答2:


I calculated the percentage by transposing the dataframe twice. Did it step by step to show the logic more explicitly.

#transpose
to_plot =sub_df1.unstack()
to_plot_transpose = to_plot.transpose()

#calc %
to_plot_transpose_pct = to_plot_transpose.div(to_plot_transpose.sum())

#transpose back
to_plot_pct=to_plot_transpose_pct.transpose()

#plot
to_plot_pct.plot(kind='bar',stacked=True,rot=1,figsize= . 
  (8,8),title="Home ownership across Loan Types")


来源:https://stackoverflow.com/questions/40433876/creating-percentage-stacked-bar-chart-using-groupby

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!