I'd like to convert a Pandas DataFrame that is derived from a pivot table into a row representation as shown below.
This is where I'm at:

    import pandas as pd
    import numpy as np

    df = pd.DataFrame({
        'goods': ['a', 'a', 'b', 'b', 'b'],
        'stock': [5, 10, 30, 40, 10],
        'category': ['c1', 'c2', 'c1', 'c2', 'c1'],
        'date': pd.to_datetime(['2014-01-01', '2014-02-01', '2014-01-06',
                                '2014-02-09', '2014-03-09'])
    })

    # we don't care about year in this example
    df['month'] = df['date'].map(lambda x: x.month)

    # pivot: one row per month, one column per (goods, category) pair
    piv = df.pivot_table(["stock"], "month", ["goods", "category"], aggfunc="sum")
    # make sure every month between the first and the last one has a row
    piv = piv.reindex(np.arange(piv.index[0], piv.index[-1] + 1))
    # carry the last known value forward, then zero-fill what is still missing
    piv = piv.ffill(axis=0)
    piv = piv.fillna(0)
    print(piv)

which results in

             stock
    goods        a       b
    category    c1  c2  c1  c2
    month
    1            5   0  30   0
    2            5  10  30  40
    3            5  10  10  40

And this is where I want to get to.

    goods category month stock
    a     c1       1     5
    a     c1       2     0
    a     c1       3     0
    a     c2       1     0
    a     c2       2     10
    a     c2       3     0
    b     c1       1     30
    b     c1       2     0
    b     c1       3     10
    b     c2       1     0
    b     c2       2     40
    b     c2       3     0

Previously, I used

    piv = piv.stack()
    piv = piv.reset_index()
    print(piv)

to get rid of the multi-index, but since I now pivot on two columns (["goods", "category"]), the columns still have two levels afterwards and the result looks like this:

          month category stock
    goods                    a   b
    0         1       c1     5  30
    1         1       c2     0   0
    2         2       c1     5  30
    3         2       c2    10  40
    4         3       c1     5  10
    5         3       c2    10  40

Does anyone know how I can get rid of the multi-index in the columns and get the result into a flat DataFrame in the format shown above?
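
For reference, here is a minimal sketch of one direction that might work. It moves both column levels into rows at once with `unstack()` instead of stacking only the innermost level. It assumes that `values` is passed as the plain string `"stock"` rather than the list `["stock"]`, so that the pivot has no extra `stock` column level. The name `flat` is just a placeholder. Note that the stock values follow the forward-filled pivot output printed above, not the zeros in the hand-written target table.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'goods': ['a', 'a', 'b', 'b', 'b'],
    'stock': [5, 10, 30, 40, 10],
    'category': ['c1', 'c2', 'c1', 'c2', 'c1'],
    'date': pd.to_datetime(['2014-01-01', '2014-02-01', '2014-01-06',
                            '2014-02-09', '2014-03-09']),
})
df['month'] = df['date'].dt.month

# Assumption: values is a string, so the columns only have the two
# levels (goods, category) and no outer 'stock' level.
piv = df.pivot_table(values='stock', index='month',
                     columns=['goods', 'category'], aggfunc='sum')
piv = piv.reindex(np.arange(piv.index[0], piv.index[-1] + 1))
piv = piv.ffill().fillna(0)

# With a single-level row index, unstack() returns a Series indexed by
# (goods, category, month); reset_index turns those levels into columns.
flat = piv.unstack().reset_index(name='stock')
print(flat)
```

The resulting frame has one row per (goods, category, month) combination and the plain columns `goods`, `category`, `month` and `stock`.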