Question
I have a data frame like this:
               value  identifier
2007-01-01  0.781611          55
2007-01-01  0.766152          56
2007-01-01  0.766152          57
2007-02-01  0.705615          55
2007-02-01  0.032134          56
2007-02-01  0.032134          57
2008-01-01  0.026512          55
2008-01-01  0.993124          56
2008-01-01  0.993124          57
2008-02-01  0.226420          55
2008-02-01  0.033860          56
2008-02-01  0.033860          57
So I do a groupby per identifier:
df.groupby('identifier')
Now I want to generate subplots in a grid, one plot per group. I tried
df.groupby('identifier').plot(subplots=True)
or
df.groupby('identifier').plot(subplots=False)
and
plt.subplots(3,3)
df.groupby('identifier').plot(subplots=True)
to no avail. How can I create the graphs?
Answer 1:
Here's an automated layout for many groups (using random fake data). Experimenting with grouped.get_group(key) will show you how to build more elegant plots.
import math
import pandas as pd
from numpy.random import randint
import matplotlib.pyplot as plt

df = pd.DataFrame(randint(0, 10, (200, 6)), columns=list('abcdef'))
grouped = df.groupby('a')
# Round up so an odd number of groups still fits in two rows
ncols = math.ceil(grouped.ngroups / 2)
fig, axs = plt.subplots(figsize=(9, 4),
                        nrows=2, ncols=ncols,
                        gridspec_kw=dict(hspace=0.4))  # gridspec_kw controls the spacing
# zip stops at the shorter sequence, so any spare axes simply stay empty
targets = zip(grouped.groups.keys(), axs.flatten())
for key, ax in targets:
    # DataFrame.plot labels each column, so the legend has entries
    grouped.get_group(key).plot(ax=ax)
    ax.set_title('a=%d' % key)
    ax.legend()
plt.show()
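
The same idea can be applied directly to the data frame from the question. The following is a minimal sketch, not a definitive implementation: it rebuilds the sample data by hand, iterates over the groupby object instead of calling get_group, and draws each group on its own axis. The one-row layout, the figure size, and the Agg backend are assumptions made only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so no display is needed
import matplotlib.pyplot as plt
import pandas as pd

# Rebuild the sample data from the question (dates as index, three rows per date)
dates = pd.to_datetime(["2007-01-01", "2007-02-01", "2008-01-01", "2008-02-01"])
df = pd.DataFrame(
    {
        "value": [0.781611, 0.766152, 0.766152,
                  0.705615, 0.032134, 0.032134,
                  0.026512, 0.993124, 0.993124,
                  0.226420, 0.033860, 0.033860],
        "identifier": [55, 56, 57] * 4,
    },
    index=dates.repeat(3),
)

grouped = df.groupby("identifier")
# One column of the grid per group; squeeze=False keeps axs two-dimensional
fig, axs = plt.subplots(nrows=1, ncols=grouped.ngroups,
                        figsize=(9, 3), sharey=True, squeeze=False)
# Iterating over a groupby yields (key, sub-frame) pairs
for (key, group), ax in zip(grouped, axs.flatten()):
    group["value"].plot(ax=ax, title="identifier=%d" % key)
fig.autofmt_xdate()
# Call plt.show() with an interactive backend, or fig.savefig(...), to view the result
```

Passing ax=ax is the key step: it tells pandas to draw on an existing axis rather than open a new figure for every group.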

Answer 2:
You can use pivot_table to move the identifiers into columns and then plot:
pd.pivot_table(df.reset_index(),
               index='index', columns='identifier', values='value'
              ).plot(subplots=True)

The output of
pd.pivot_table(df.reset_index(),
               index='index', columns='identifier', values='value'
              )
looks like this:
identifier        55        56        57
index
2007-01-01  0.781611  0.766152  0.766152
2007-02-01  0.705615  0.032134  0.032134
2008-01-01  0.026512  0.993124  0.993124
2008-02-01  0.226420  0.033860  0.033860
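
Since the question asks for a grid, the pivoted frame can also be combined with the layout argument of DataFrame.plot. This is a rough sketch, assuming the sample data from the question; the 2x2 layout, the figure size, and the Agg backend are illustrative choices, not part of the original answer.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so no display is needed
import pandas as pd

# Rebuild the sample data from the question
dates = pd.to_datetime(["2007-01-01", "2007-02-01", "2008-01-01", "2008-02-01"])
df = pd.DataFrame(
    {
        "value": [0.781611, 0.766152, 0.766152,
                  0.705615, 0.032134, 0.032134,
                  0.026512, 0.993124, 0.993124,
                  0.226420, 0.033860, 0.033860],
        "identifier": [55, 56, 57] * 4,
    },
    index=dates.repeat(3),
)

# reset_index() turns the unnamed date index into a column called 'index'
wide = pd.pivot_table(df.reset_index(),
                      index="index", columns="identifier", values="value")
# layout=(rows, cols) arranges the per-column subplots in a grid
axes = wide.plot(subplots=True, layout=(2, 2), figsize=(8, 5), sharex=True)
```

With three identifiers and a 2x2 layout, one cell of the grid is left unused.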
Answer 3:
If you have a Series with a MultiIndex, here is another way to get the desired graph:
df.unstack('identifier').plot.line(subplots=True)
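
To make the starting point concrete, here is a small sketch of how the question's data could be turned into such a Series. It assumes the frame from the question; appending identifier to the index with set_index is one possible way to get the MultiIndex, not necessarily what the answer's author had in mind, and the Agg backend is only there so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so no display is needed
import pandas as pd

# Rebuild the sample data from the question
dates = pd.to_datetime(["2007-01-01", "2007-02-01", "2008-01-01", "2008-02-01"])
df = pd.DataFrame(
    {
        "value": [0.781611, 0.766152, 0.766152,
                  0.705615, 0.032134, 0.032134,
                  0.026512, 0.993124, 0.993124,
                  0.226420, 0.033860, 0.033860],
        "identifier": [55, 56, 57] * 4,
    },
    index=dates.repeat(3),
)

# Append 'identifier' to the date index, giving a Series indexed by (date, identifier)
s = df.set_index("identifier", append=True)["value"]
# unstack moves the 'identifier' level into columns, one column per identifier
unstacked = s.unstack("identifier")
axes = unstacked.plot.line(subplots=True)
```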
Answer 4:
Here is a solution for those who need to plot graphs to explore different levels of aggregation when grouping by multiple columns.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from numpy.random import randint

levels_bool = np.tile(np.arange(0, 2), 100)
levels_groups = np.repeat(np.arange(0, 4), 50)
x_axis = np.tile(np.arange(0, 10), 20)
values = randint(0, 10, 200)
stacked = np.stack((levels_bool, levels_groups, x_axis, values), axis=0)
df = pd.DataFrame(stacked.T, columns=['bool', 'groups', 'x_axis', 'values'])

columns = df['bool'].nunique()
rows = df['groups'].nunique()
fig, axs = plt.subplots(rows, columns, figsize=(20, 20))

grouped_df = df.groupby(['groups', 'bool', 'x_axis']).agg({
    'values': ['min', 'mean', 'median', 'max']
})
# One grid row per 'groups' value, one grid column per 'bool' value
for y_index, (group_name, grp) in enumerate(grouped_df.groupby(level='groups')):
    for x_index, (boolean, grp2) in enumerate(grp.groupby(level='bool')):
        ax = axs[y_index, x_index]
        stats = grp2['values']  # columns: min, mean, median, max
        x = grp2.index.get_level_values('x_axis')
        # Draw one labelled line per aggregate so the legend is meaningful
        for stat in stats.columns:
            ax.plot(x, stats[stat], label=stat)
        ax.set_title("Group:{} Bool:{}".format(group_name, boolean))
        ax.legend()
plt.subplots_adjust(hspace=0.5)
plt.show()
Source: https://stackoverflow.com/questions/29975835/how-to-create-pandas-groupby-plot-with-subplots