Question
I have a dataset with a categorical variable that contains three unique values, "Low", "Medium" and "High":
df.CatVar.value_counts()
Out[93]:
Medium 35832
Low 25311
High 12527
Name: CatVar, dtype: int64
I am trying to plot the count of each value as a bar plot. However, the following code gives me the bars in the order ["Medium", "Low", "High"]:
df.CatVar.value_counts().plot(kind="bar")
How do I change the order of the bars in the plot?
Answer 1:
There are two possible solutions. The first is to change the order of the index before plotting, using reindex or loc:
df.CatVar.value_counts().reindex(["Low", "Medium", "High"]).plot(kind="bar")
df.CatVar.value_counts().loc[["Low", "Medium", "High"]].plot(kind="bar")
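The two calls above behave the same when every label is present, but they can differ when a category is missing from the data. The following is a minimal sketch, assuming a recent pandas version (1.0 or later) and a made-up sample in which "High" never occurs; the variable names are illustrative only. It skips the plot and inspects the reordered counts, since the bar order follows the index order.

```python
import pandas as pd

# Hypothetical sample: "High" does not appear in the data.
order = ["Low", "Medium", "High"]
s = pd.Series(["Low", "Medium", "Low"], name="CatVar")
counts = s.value_counts()

# reindex accepts labels that are absent; fill_value=0 gives them a zero count
# instead of NaN, so the missing category still gets an (empty) bar.
reindexed = counts.reindex(order, fill_value=0)
print(reindexed)

# .loc with a list containing an absent label raises KeyError in pandas >= 1.0.
try:
    counts.loc[order]
    loc_failed = False
except KeyError:
    loc_failed = True
print("loc raised KeyError:", loc_failed)
```

If every category is guaranteed to be present, either form works; reindex is the more forgiving choice when some categories may be empty.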
Or convert the column to an ordered categorical, so that value_counts(sort=False) returns the counts in the order given by the categories parameter:
df.CatVar = pd.Categorical(df.CatVar, categories=["Low", "Medium", "High"], ordered=True)
df.CatVar.value_counts(sort=False).plot(kind="bar")
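As a rough check of the categorical approach, the sketch below builds a small made-up frame and looks at the order of the counts without plotting. It assumes only pandas; the data values are illustrative.

```python
import pandas as pd

# Small illustrative frame; the values are made up for this sketch.
df = pd.DataFrame({"CatVar": ["Low", "Medium", "Low", "Low", "Medium", "High"]})

# Declare the desired order once, on the column itself.
df["CatVar"] = pd.Categorical(
    df["CatVar"], categories=["Low", "Medium", "High"], ordered=True
)

# sort=False keeps the category order instead of sorting by frequency.
counts = df["CatVar"].value_counts(sort=False)
print(counts)

# Sorting the default (frequency-sorted) result by index also follows
# the category order, because the index is categorical.
counts_sorted = df["CatVar"].value_counts().sort_index()
print(counts_sorted)
```

An advantage of this approach is that the order is stored with the column, so later operations on the same column can reuse it without repeating the list.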
Sample:
df = pd.DataFrame({'CatVar':['Low','Medium','Low','Low','Medium','High']})
print(df)
CatVar
0 Low
1 Medium
2 Low
3 Low
4 Medium
5 High
df.CatVar.value_counts().reindex(["Low", "Medium", "High"]).plot(kind="bar")
Answer 2:
The following code solved my problem:
df.CatVar.value_counts()[['Low', 'Medium', 'High']].plot(kind="bar")
Answer 3:
If you do not mind using seaborn, you can use countplot, which has an order parameter for setting the order of the bars:
import pandas as pd
import seaborn as sns
df = pd.DataFrame({'CatVar':['Low','High','Low','Low','Medium']})
sns.countplot(x='CatVar', data=df, order=['Low', 'Medium', 'High']);
Source: https://stackoverflow.com/questions/50553698/pandas-plot-bar-order-categories