how to unstack (or pivot?) in pandas

前端 未结 3 741
Happy的楠姐
Happy的楠姐 2020-12-14 03:39

I have a dataframe that looks like the following:

import pandas as pd
datelisttemp = pd.date_range(\'1/1/2014\', periods=3, freq=\'D\')
s = list(datelisttemp         


        
相关标签:
3条回答
  • 2020-12-14 04:03

    We want values (e.g. 'GERMANY') to become column names, and column names (e.g. 'HOUR1') to become values -- a swap of sorts.

    The stack method turns column names into index values, and the unstack method turns index values into column names.

    So by shifting the values into the index, we can use stack and unstack to perform the swap.

    import pandas as pd
    
    datelisttemp = pd.date_range('1/1/2014', periods=3, freq='D')
    s = list(datelisttemp)*3
    s.sort()
    df = pd.DataFrame({'BORDER':['GERMANY','FRANCE','ITALY','GERMANY','FRANCE','ITALY','GERMANY','FRANCE','ITALY' ], 'HOUR1':[2 ,2 ,2 ,4 ,4 ,4 ,6 ,6, 6],'HOUR2':[3 ,3 ,3, 5 ,5 ,5, 7, 7, 7], 'HOUR3':[8 ,8 ,8, 12 ,12 ,12, 99, 99, 99]}, index=s)
    
    df = df.set_index(['BORDER'], append=True)
    df.columns.name = 'HOUR'
    df = df.unstack('BORDER')
    df = df.stack('HOUR')
    df = df.reset_index('HOUR')
    df['HOUR'] = df['HOUR'].str.replace('HOUR', '').astype('int')
    print(df)
    

    yields

    BORDER      HOUR  FRANCE  GERMANY  ITALY
    2014-01-01     1       2        2      2
    2014-01-01     2       3        3      3
    2014-01-01     3       8        8      8
    2014-01-02     1       4        4      4
    2014-01-02     2       5        5      5
    2014-01-02     3      12       12     12
    2014-01-03     1       6        6      6
    2014-01-03     2       7        7      7
    2014-01-03     3      99       99     99
    
    0 讨论(0)
  • 2020-12-14 04:07

    Try using pivot. You can make it in one line. Eg.

    df.pivot(index='start_time', columns='venue_name', values='ocupation')

    0 讨论(0)
  • 2020-12-14 04:29

    Using your df2:

    >>> df2.pivot_table(values='value', index=['DATE', 'variable'], columns="BORDER")
    BORDER               FRANCE  GERMANY  ITALY
    DATE       variable                        
    2014-01-01 HOUR1          2        2      2
               HOUR2          3        3      3
               HOUR3          8        8      8
    2014-01-02 HOUR1          4        4      4
               HOUR2          5        5      5
               HOUR3         12       12     12
    2014-01-03 HOUR1          6        6      6
               HOUR2          7        7      7
               HOUR3         99       99     99
    
    [9 rows x 3 columns]
    

    There is still a bit of cleanup to do if you want to convert the index level "variable" into a column called "HOUR" and strip out the text "HOUR" from the values, but I think that is the basic format you want.

    0 讨论(0)
提交回复
热议问题