Transposing Data in Pandas

只谈情不闲聊 提交于 2021-02-11 15:15:55

问题


I have an excel file that contains a count of how many times an Part has been used during its lifespan. The data is currently stored in such a way that the Serial numbers are in Column A, and each "Lifespan" Count is stored in adjacent columns, with a "Date" value as its heading.
Here is an example:

Image1

I want to be able to pivot/transpose ALL of the Date columns in Python so that the output is in the following format with the Lifespan Count as a new column named "Count":

Image2

I've tried using the pivot function in pandas to flip the appropriate rows & columns around, however the problem is that I'd have to define each column one at a time. The nature of the spreadsheet used in this example is just one of several spreadsheets I use that all start at a different Date (some start at 01/01/2010 to present while others start at 01/01/1999 to present). It is worth noting that all spreadsheets start at the first of every month and increment 1 month at a time.

Therefore is there a way I can just select ALL the columns I've loaded into a Dataframe via the spreadsheet loaded to be able to Pivot/Transpose the data as mentioned?

pivot = df.pivot_table(index=['Serial'], values=['01/01/2019','01/02/2019',... etc], aggfunc='max'])

回答1:


Simple use of stack() if you have named your column series and have set_index() on your rows.

import datetime as dt
import random
df = pd.DataFrame({**{"Serial":[s for s in range(1111,1121)]}, 
 **{d:[random.randint(0,150) for i in range(10)] for d in pd.date_range(start=dt.date(2019,1,1), end=dt.date(2019,1,6))}
}).set_index("Serial")
df.columns.set_names("Date", inplace=True)
dfs = df.stack().to_frame().rename(columns={0:"Count"})

print(f"{df.iloc[:,:3].to_string()}\n\n{dfs.iloc[:10].to_string()}")

sample output

Date    2019-01-01 00:00:00  2019-01-02 00:00:00  2019-01-03 00:00:00
Serial                                                               
1111                    134                   76                    8
1112                    115                   37                    8
1113                     16                   23                  130
1114                    108                   45                   69
1115                      0                   51                  111
1116                     51                   71                  101
1117                    107                   10                  142
1118                     80                   48                  134
1119                    148                   62                   75
1120                      4                  135                  100

                            Count
Serial Date                      
1111   2019-01-01 00:00:00    134
       2019-01-02 00:00:00     76
       2019-01-03 00:00:00      8
       2019-01-04 00:00:00     16
       2019-01-05 00:00:00    140
       2019-01-06 00:00:00     28
1112   2019-01-01 00:00:00    115
       2019-01-02 00:00:00     37
       2019-01-03 00:00:00      8
       2019-01-04 00:00:00     56


来源:https://stackoverflow.com/questions/63377376/transposing-data-in-pandas

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!