Question
I have a Pandas dataframe with a 'year' column, a 'count' column and a 'category' column. Not every category has data for every year. I would like to plot an array of bar charts, one above the other, using a common x axis (year). The code I've written almost works, except that the x axis is not common to all plots.
The code example is given below. Basically, the code creates an array of axes with sharex=True and then steps through each axis plotting the relevant data from the dataframe.
import pandas as pd
import matplotlib.pyplot as plt

# Define dataframe
myDF = pd.DataFrame({'year':list(range(2000,2010))+list(range(2001,2008))+list(range(2005,2010)),
                     'category':['A']*10 + ['B']*7 + ['C']*5,
                     'count':[2,3,4,3,4,5,4,3,4,5,2,3,4,5,4,5,6,9,8,7,8,6]})
# Plot counts for individual categories in array of bar charts
fig, axarr = plt.subplots(3, figsize = (4,6), sharex = True)
for i in range(0,len(myDF['category'].unique())):
    myDF.loc[myDF['category'] == myDF['category'].unique()[i],['year','count']].plot(kind = 'bar',
        ax = axarr[i],
        x = 'year',
        y = 'count',
        legend = False,
        title = 'Category {0} bar chart'.format(myDF['category'].unique()[i]))
fig.subplots_adjust(hspace=0.5)
plt.show()
(A screenshot of the outcome accompanied the original post.)
I was expecting the Category A bars to extend from 2000 to 2009, the Category B bars from 2001 to 2007 and the Category C bars from 2005 to 2009. However, it seems that only the first 5 bars of each category are plotted, regardless of the value on the x axis. Presumably only 5 bars are plotted because the last category only has data for 5 years. A bigger problem is that the data for the other categories is not plotted against the correct year. I've searched for solutions and tried various modifications, but nothing seems to work.
Any suggestions to resolve this issue would be very welcome.
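Answer 1:
A pandas bar plot places its bars at integer positions 0, 1, 2, ... and uses the year values only as tick labels, so sharex=True aligns the positions rather than the years. Try reshaping the data so that every category shares the same year index, then plot each column: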
d = myDF.groupby(['year', 'category'])['count'].sum().unstack()
fig, axarr = plt.subplots(3, figsize = (4,6), sharex=True)
for i, cat in enumerate(d.columns):
    d[cat].plot(kind='bar', ax=axarr[i], title='Category {cat} bar chart'.format(cat=cat))
fig.subplots_adjust(hspace=0.5)
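For reference, here is a self-contained sketch that rebuilds the question's dataframe, prints the reshaped table `d` so the NaN entries for missing year/category pairs are visible, and then tries a different route that is not part of the answer above: drawing the bars with matplotlib's `Axes.bar` at the actual year values, so that `sharex=True` lines up real years. The loop variables `ax` and `sub` are names chosen for this illustration only.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Same data as in the question
myDF = pd.DataFrame({
    'year': list(range(2000, 2010)) + list(range(2001, 2008)) + list(range(2005, 2010)),
    'category': ['A'] * 10 + ['B'] * 7 + ['C'] * 5,
    'count': [2, 3, 4, 3, 4, 5, 4, 3, 4, 5, 2, 3, 4, 5, 4, 5, 6, 9, 8, 7, 8, 6],
})

# Reshape as in the answer: one row per year, one column per category.
# Year/category pairs with no data become NaN, so every column spans 2000-2009.
d = myDF.groupby(['year', 'category'])['count'].sum().unstack()
print(d)

# Alternative (an assumption of this sketch, not the answer's method):
# plot with Axes.bar so the x coordinates are the numeric years themselves.
categories = myDF['category'].unique()
fig, axarr = plt.subplots(len(categories), figsize=(4, 6), sharex=True)
for ax, cat in zip(axarr, categories):
    sub = myDF[myDF['category'] == cat]
    ax.bar(sub['year'], sub['count'])
    ax.set_title('Category {0} bar chart'.format(cat))

# Put one tick on every year; the x axis is shared, so this applies to all subplots
axarr[-1].set_xticks(sorted(myDF['year'].unique()))
axarr[-1].tick_params(axis='x', labelrotation=90)
fig.subplots_adjust(hspace=0.5)
```

Call `plt.show()` afterwards to display the figure. With this variant each subplot only contains bars for the years that category actually has, whereas the unstack approach gives every subplot the same ten labelled slots and leaves the missing ones empty.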
Source: https://stackoverflow.com/questions/54327324/plotting-pandas-data-as-an-array-of-bar-chart-does-not-honour-sharex-true