I have a DataFrame, outcome2, from which I generate a grouped boxplot like this:
In [11]: outcome2.boxplot(column='Hospital 30-Day Death (Mortality) Rates from Heart Attack',by='State')
plt.ylabel('30 Day Death Rate')
plt.title('30 Day Death Rate by State')
Out [11]:
I'd like the boxes to be ordered by each state's median instead of alphabetically, but I'm not sure how to go about it.
To sort by the median, compute the median of each column, sort the result, and use the sorted Index to reorder the DataFrame's columns (here df has one column per state):
In [45]: df.iloc[:10, :5]
Out[45]:
AK AL AR AZ CA
0 0.047 0.199 0.969 -0.205 1.053
1 0.206 0.132 -0.712 0.111 -0.254
2 0.638 0.233 -0.907 1.284 1.193
3 1.234 0.046 0.624 0.485 -0.048
4 -1.362 -0.559 1.108 -0.501 0.111
5 1.276 -0.954 0.653 -0.175 -0.287
6 0.524 -1.785 -0.887 1.354 -0.431
7 0.111 0.762 -0.514 0.808 0.728
8 1.301 0.619 0.957 1.542 -0.087
9 -0.892 2.327 1.363 -1.537 0.142
In [46]: med = df.median()
In [47]: med = med.sort_values()  # the in-place Series.sort() no longer exists in current pandas
In [48]: newdf = df[med.index]
In [49]: newdf.iloc[:10, :5]
Out[49]:
PA CT LA RI MO
0 -0.667 0.774 -0.999 -0.938 0.155
1 0.822 0.390 -0.014 -2.228 0.570
2 -1.037 0.838 -0.673 2.038 0.809
3 0.620 2.845 -0.523 -0.151 -0.955
4 -0.918 1.043 0.613 0.698 -0.446
5 -0.767 0.869 -0.496 -0.925 -0.374
6 -0.495 0.437 1.245 -1.046 0.894
7 -1.283 0.358 0.016 0.137 0.511
8 -0.018 -0.047 -0.639 -0.385 0.080
9 -1.705 0.986 0.605 0.295 0.302
In [50]: med.head()
Out[50]:
PA -0.117
CT -0.077
LA -0.072
RI -0.069
MO -0.053
dtype: float64
The resulting figure: [boxplot with the states ordered by median]
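The session above works on a wide frame with one column per state, whereas outcome2 in the question appears to be in long form (one row per hospital, with a State column). Below is a minimal sketch for that layout, not a definitive implementation. The data is synthetic, and the column name rate is a hypothetical stand-in for 'Hospital 30-Day Death (Mortality) Rates from Heart Attack'. It computes the per-state medians with groupby, sorts them, and draws the boxes in that order with matplotlib directly.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Synthetic long-format data (an assumption, not the real dataset):
# 40 rows per state, one numeric column named "rate".
rng = np.random.default_rng(0)
states = ["AK", "AL", "AR", "AZ", "CA"]
long_df = pd.DataFrame({
    "State": np.repeat(states, 40),
    "rate": rng.normal(loc=15.0, scale=2.0, size=40 * len(states)),
})

# Median per state, sorted ascending; the resulting index is the plot order.
order = long_df.groupby("State")["rate"].median().sort_values().index

# One array of values per state, in median order. Missing values are
# dropped so that every box is computed from real observations only.
groups = [
    long_df.loc[long_df["State"] == s, "rate"].dropna().to_numpy()
    for s in order
]

fig, ax = plt.subplots(figsize=(8, 4))
ax.boxplot(groups)                # boxes are placed at x = 1..len(groups)
ax.set_xticklabels(list(order))   # label each box with its state
ax.set_ylabel("30 Day Death Rate")
ax.set_title("30 Day Death Rate by State")
```

Call plt.show() (or fig.savefig) to see the figure when running this as a script. To sort from highest to lowest median instead, pass ascending=False to sort_values.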
Source: https://stackoverflow.com/questions/19469568/how-to-sort-a-boxplot-by-the-median-values-in-pandas