Question
I have a dataframe with the product as its index and 12 months of sales (one column per month). I'd like to 'pivot' the dataframe to end up with a single date index.
Example data:
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randint(10, 1000, size=(2,12)), index=['PrinterBlue', 'PrinterBetter'], columns=pd.date_range('2014-01-01', periods=12, freq='M'))
yielding:
>>> df
               2014-01-31  2014-02-28  2014-03-31  2014-04-30  2014-05-31  \
PrinterBlue           176          77          89         279          81
PrinterBetter         801         660         349         608         322

               2014-06-30  2014-07-31  2014-08-31  2014-09-30  2014-10-31  \
PrinterBlue           286         831         114         996         904
PrinterBetter         994         374         895         586         646

               2014-11-30  2014-12-31
PrinterBlue           458         117
PrinterBetter         366         196
Desired result:
Brand          Date        Sales
PrinterBlue    2014-01-31    176
               2014-02-28     77
               2014-03-31     89
               [...]
               2014-11-30    458
               2014-12-31    117
PrinterBetter  2014-01-31    801
               2014-02-28    660
               2014-03-31    349
               [...]
               2014-11-30    366
               2014-12-31    196
I can imagine getting the result by:
- Building 12 sub-dataframes, each containing only one month of data
- Pivoting each dataframe
- Concatenating them
But that seems like a pretty complicated way to do the transformation. Is there a better or simpler way?
Answer 1:
I think pandas melt provides the functionality you are looking for:
http://pandas.pydata.org/pandas-docs/stable/reshaping.html#reshaping-by-melt
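import pandas as pd
import numpy as np

# The year is pinned so the dates match the output shown. On pandas >= 2.2 the month-end alias is 'ME' ('M' is deprecated).
df = pd.DataFrame(np.random.randint(10, 1000, size=(2,12)), index=['PrinterBlue', 'PrinterBetter'], columns=pd.date_range('2014-01-01', periods=12, freq='M'))
# Transpose so the dates become the index and each brand becomes a column.
dft = df.T
# Copy the date index into a regular column to use as the id variable.
dft["date"] = dft.index
# melt turns each brand column into (date, brand, value) rows.
result = pd.melt(dft, id_vars=["date"])
result.columns = ["date", "brand", "sales"]
print(result)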
This outputs (the sales values are random, so yours will differ):
          date          brand  sales
0   2014-01-31    PrinterBlue    242
1   2014-02-28    PrinterBlue    670
2   2014-03-31    PrinterBlue    142
3   2014-04-30    PrinterBlue    571
4   2014-05-31    PrinterBlue    826
5   2014-06-30    PrinterBlue    515
6   2014-07-31    PrinterBlue    568
7   2014-08-31    PrinterBlue     90
8   2014-09-30    PrinterBlue    652
9   2014-10-31    PrinterBlue    488
10  2014-11-30    PrinterBlue    671
11  2014-12-31    PrinterBlue    767
12  2014-01-31  PrinterBetter    294
13  2014-02-28  PrinterBetter     77
14  2014-03-31  PrinterBetter     59
15  2014-04-30  PrinterBetter    373
16  2014-05-31  PrinterBetter    228
17  2014-06-30  PrinterBetter    708
18  2014-07-31  PrinterBetter     16
19  2014-08-31  PrinterBetter    542
20  2014-09-30  PrinterBetter    577
21  2014-10-31  PrinterBetter    141
22  2014-11-30  PrinterBetter    358
23  2014-12-31  PrinterBetter    290
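If you want the (Brand, Date) index layout from the question rather than a flat table, DataFrame.stack may be a shorter route. This is only a sketch and not part of the original answer: it uses the first three months of the question's sample values with explicitly written dates, and the names sales and long_df are made up for illustration.

```python
import pandas as pd

# Assumed sample: the first three months of the question's example values,
# with the month-end dates written out explicitly.
dates = pd.to_datetime(["2014-01-31", "2014-02-28", "2014-03-31"])
df = pd.DataFrame(
    [[176, 77, 89], [801, 660, 349]],
    index=["PrinterBlue", "PrinterBetter"],
    columns=dates,
)

# stack() moves the column labels (the dates) into an inner index level,
# producing a Series indexed by (brand, date).
sales = df.stack()
sales.index.names = ["Brand", "Date"]
sales.name = "Sales"

# reset_index() flattens the result into three ordinary columns,
# similar in shape to what melt returns.
long_df = sales.reset_index()

print(sales)
```

Here sales keeps the two-level index shown in the desired result, while long_df is the flat Brand / Date / Sales table.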
Source: https://stackoverflow.com/questions/21928814/pandas-dataframe-multiple-time-date-columns-to-single-date-index