Plotting the same column from various DataFrames in a Panel

问题

I've got data from a simulation which gives me some values stored in a DataFrame (100 rows x 6 columns). For varying starting values I saved my data in a Panel (2 DataFrames x 100 rows x 6 columns).

Now I want to compare how the column named 'A' in both simulations (DataFrames named 'Sim1' and 'Sim2') compare and one way to do that is via the DataFrame.plot command

Panel['Sim1'].plot(x = 'xvalues', y='A')
Panel['Sim2'].plot(x = 'xvalues', y='A')
plt.show()

This works, but I feel like it somehow should be possible to plot like da data was in the same DataFrame where I could plot like this

DataFrame.plot(x = 'xvalues', y = ['A1', 'A2'])

Am I missing something or is it just impossible to simply plot the two graphs into one figure with one command if the data is stored in a Panel?

回答1:

Consider the following example:

In [77]: import pandas_datareader.data as web

In [78]: p = web.DataReader(['AAPL','GOOGL'], 'yahoo', '2017-01-01')

In [79]: p.axes
Out[79]:
[Index(['Open', 'High', 'Low', 'Close', 'Volume', 'Adj Close'], dtype='object'),
 DatetimeIndex(['2017-01-03', '2017-01-04', '2017-01-05', '2017-01-06', '2017-01-09', '2017-01-10', '2017-01-11', '2017-01-12',
                '2017-01-13', '2017-01-17', '2017-01-18', '2017-01-19', '2017-01-20', '2017-01-23', '2017-01-24', '2017-01-25',
                '2017-01-26', '2017-01-27', '2017-01-30', '2017-01-31', '2017-02-01', '2017-02-02', '2017-02-03', '2017-02-06',
                '2017-02-07', '2017-02-08', '2017-02-09', '2017-02-10', '2017-02-13', '2017-02-14', '2017-02-15', '2017-02-16',
                '2017-02-17', '2017-02-21', '2017-02-22', '2017-02-23', '2017-02-24', '2017-02-27', '2017-02-28', '2017-03-01',
                '2017-03-02', '2017-03-03', '2017-03-06', '2017-03-07', '2017-03-08', '2017-03-09', '2017-03-10', '2017-03-13',
                '2017-03-14', '2017-03-15', '2017-03-16', '2017-03-17', '2017-03-20', '2017-03-21', '2017-03-22', '2017-03-23',
                '2017-03-24', '2017-03-27', '2017-03-28', '2017-03-29', '2017-03-30', '2017-03-31', '2017-04-03', '2017-04-04',
                '2017-04-05', '2017-04-06', '2017-04-07', '2017-04-10', '2017-04-11', '2017-04-12', '2017-04-13', '2017-04-17',
                '2017-04-18', '2017-04-19', '2017-04-20', '2017-04-21'],
               dtype='datetime64[ns]', name='Date', freq=None),
 Index(['AAPL', 'GOOGL'], dtype='object')]

In [80]: p.loc['Adj Close']
Out[80]:
                  AAPL       GOOGL
Date
2017-01-03  115.648597  808.010010
2017-01-04  115.519154  807.770020
2017-01-05  116.106611  813.020020
2017-01-06  117.401002  825.210022
2017-01-09  118.476334  827.179993
2017-01-10  118.595819  826.010010
2017-01-11  119.233055  829.859985
2017-01-12  118.735214  829.530029
2017-01-13  118.526121  830.940002
2017-01-17  119.481976  827.460022
...                ...         ...
2017-04-07  143.339996  842.099976
2017-04-10  143.169998  841.700012
2017-04-11  141.630005  839.880005
2017-04-12  141.800003  841.460022
2017-04-13  141.050003  840.179993
2017-04-17  141.830002  855.130005
2017-04-18  141.199997  853.989990
2017-04-19  140.679993  856.510010
2017-04-20  142.440002  860.080017
2017-04-21  142.270004  858.950012

[76 rows x 2 columns]

plot it

In [81]: p.loc['Adj Close'].plot()
Out[81]: <matplotlib.axes._subplots.AxesSubplot at 0xdabfda0>

Examples of different slicing/indexing/selecting for the sample Panel:

In [118]: p
Out[118]:
<class 'pandas.core.panel.Panel'>
Dimensions: 6 (items) x 76 (major_axis) x 2 (minor_axis)
Items axis: Open to Adj Close
Major_axis axis: 2017-01-03 00:00:00 to 2017-04-21 00:00:00
Minor_axis axis: AAPL to GOOGL

By items axis (index):

In [119]: p.loc['Adj Close']
Out[119]:
                  AAPL       GOOGL
Date
2017-01-03  115.648597  808.010010
2017-01-04  115.519154  807.770020
2017-01-05  116.106611  813.020020
2017-01-06  117.401002  825.210022
2017-01-09  118.476334  827.179993
2017-01-10  118.595819  826.010010
2017-01-11  119.233055  829.859985
2017-01-12  118.735214  829.530029
2017-01-13  118.526121  830.940002
2017-01-17  119.481976  827.460022
...                ...         ...
2017-04-07  143.339996  842.099976
2017-04-10  143.169998  841.700012
2017-04-11  141.630005  839.880005
2017-04-12  141.800003  841.460022
2017-04-13  141.050003  840.179993
2017-04-17  141.830002  855.130005
2017-04-18  141.199997  853.989990
2017-04-19  140.679993  856.510010
2017-04-20  142.440002  860.080017
2017-04-21  142.270004  858.950012

[76 rows x 2 columns]

By major axis:

In [120]: p.loc[:, '2017-01-03']
Out[120]:
             Open        High         Low       Close      Volume   Adj Close
AAPL   115.800003  116.330002  114.760002  116.150002  28781900.0  115.648597
GOOGL  800.619995  811.440002  796.890015  808.010010   1959000.0  808.010010

By minor axis:

In [121]: p.loc[:, :, 'GOOGL']
Out[121]:
                  Open        High         Low       Close     Volume   Adj Close
Date
2017-01-03  800.619995  811.440002  796.890015  808.010010  1959000.0  808.010010
2017-01-04  809.890015  813.429993  804.109985  807.770020  1515300.0  807.770020
2017-01-05  807.500000  813.739990  805.919983  813.020020  1340500.0  813.020020
2017-01-06  814.989990  828.960022  811.500000  825.210022  2017100.0  825.210022
2017-01-09  826.369995  830.429993  821.619995  827.179993  1406800.0  827.179993
2017-01-10  827.070007  829.409973  823.140015  826.010010  1194500.0  826.010010
2017-01-11  826.619995  829.900024  821.469971  829.859985  1320200.0  829.859985
2017-01-12  828.380005  830.380005  821.010010  829.530029  1349500.0  829.530029
2017-01-13  831.000000  834.650024  829.520020  830.940002  1288000.0  830.940002
2017-01-17  830.000000  830.179993  823.200012  827.460022  1439700.0  827.460022
...                ...         ...         ...         ...        ...         ...
2017-04-07  845.000000  845.880005  837.299988  842.099976  1110000.0  842.099976
2017-04-10  841.539978  846.739990  840.789978  841.700012  1021200.0  841.700012
2017-04-11  841.700012  844.630005  834.599976  839.880005   971900.0  839.880005
2017-04-12  838.460022  843.719971  837.590027  841.460022  1126100.0  841.460022
2017-04-13  841.039978  843.729980  837.849976  840.179993  1067200.0  840.179993
2017-04-17  841.380005  855.640015  841.030029  855.130005  1044800.0  855.130005
2017-04-18  852.539978  857.390015  851.250000  853.989990   935200.0  853.989990
2017-04-19  857.390015  860.200012  853.530029  856.510010  1077500.0  856.510010
2017-04-20  859.739990  863.929993  857.500000  860.080017  1186900.0  860.080017
2017-04-21  860.619995  862.440002  857.729980  858.950012  1168200.0  858.950012

[76 rows x 6 columns]

In your case (depending on your axes) you may want to slice your Panel differently:

Panel.loc[:, :, 'A'].plot()

回答2:

Here's one approach, using Panel.apply().
The output of apply(plt.plot) is a minor_axis-by-items data frame of Line2D objects. plot() tries to plot an additional dimension that doesn't really make sense for our purposes, but we can use lines.pop() to remove the offending dimension. Hope this helps.

# generate sample data
x = np.arange(20)
y1 = np.random.randint(100, size=20)
y2 = np.random.randint(100, size=20)
data = {'A1': pd.DataFrame({'y':y1,'x':x}),
        'A2': pd.DataFrame({'y':y2,'x':x})}
p = pd.Panel(data)

# plot panels
df = p.apply(plt.plot)
df.ix[0,0].axes.lines.pop(2)
df.ix[0,0].axes.lines.pop(0)
df.ix[0,0].axes.legend(loc="lower right")

来源：https://stackoverflow.com/questions/43519643/plotting-the-same-column-from-various-dataframes-in-a-panel

标签

python

pandas

dataframe

plot

panel