Pandas - How to flatten a hierarchical index in columns

忘掉有多难 2020-11-22 02:55

I have a data frame with a hierarchical index in axis 1 (columns) (from a groupby.agg operation):

     USAF   WBAN  year  month  day  s_PC  s_CL         


        
17 Answers
  •  迷失自我
    2020-11-22 03:18

    The easiest and most intuitive solution for me was to combine the column names using get_level_values. This prevents duplicate column names when you do more than one aggregation on the same column:

    level_one = df.columns.get_level_values(0).astype(str)
    level_two = df.columns.get_level_values(1).astype(str)
    df.columns = level_one + level_two
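
To see this end to end, here is a minimal sketch on made-up data. Only the column names s_PC and s_CL come from the question; the values, the grouping key, and the choice of aggregations are assumptions for illustration.

```python
import pandas as pd

# Hypothetical data; the numbers and the grouping key are assumptions.
df = pd.DataFrame({
    "year": [2020, 2020, 2021],
    "s_PC": [1.0, 3.0, 5.0],
    "s_CL": [2.0, 4.0, 6.0],
})

# Two aggregations on the same column give a two-level column index.
agg = df.groupby("year").agg({"s_PC": ["sum", "mean"], "s_CL": ["sum"]})

# Concatenate the level names element-wise, as described above.
level_one = agg.columns.get_level_values(0).astype(str)
level_two = agg.columns.get_level_values(1).astype(str)
agg.columns = level_one + level_two

print(list(agg.columns))  # ['s_PCsum', 's_PCmean', 's_CLsum']
```

Because the aggregation name is appended, the two results computed from s_PC end up with distinct names.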
    

    If you want a separator between the level names, build a list that holds an underscore only where the second level is non-empty. This returns the same thing as Seiji Armstrong's comment on the accepted answer, which only includes underscores for columns with values in both index levels:

    level_one = df.columns.get_level_values(0).astype(str)
    level_two = df.columns.get_level_values(1).astype(str)
    column_separator = ['_' if x != '' else '' for x in level_two]
    df.columns = level_one + column_separator + level_two
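
A case where the empty second level shows up is after reset_index(), which gives the former index column an empty string in the second level. The sketch below uses made-up data. Wrapping the separator list in pd.Index is my own choice so that all three operands have the same type; it is not part of the snippet above.

```python
import pandas as pd

# Hypothetical data; values and grouping key are assumptions.
df = pd.DataFrame({"year": [2020, 2020, 2021], "s_PC": [1.0, 3.0, 5.0]})

# After reset_index() the columns are
# ('year', ''), ('s_PC', 'sum'), ('s_PC', 'mean').
agg = df.groupby("year").agg({"s_PC": ["sum", "mean"]}).reset_index()

level_one = agg.columns.get_level_values(0).astype(str)
level_two = agg.columns.get_level_values(1).astype(str)

# Underscore only where the second level is non-empty.
column_separator = pd.Index(["_" if x != "" else "" for x in level_two])
agg.columns = level_one + column_separator + level_two

print(list(agg.columns))  # ['year', 's_PC_sum', 's_PC_mean']
```

Without the conditional separator, the year column would come out as year_ with a trailing underscore.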
    

    I know this does the same thing as Andy Hayden's great answer above, but I think it is a bit more intuitive this way and is easier to remember (so I don't have to keep referring to this thread), especially for novice pandas users.

    This method also extends easily to the case where you have 3 column levels:

    level_one = df.columns.get_level_values(0).astype(str)
    level_two = df.columns.get_level_values(1).astype(str)
    level_three = df.columns.get_level_values(2).astype(str)
    df.columns = level_one + level_two + level_three
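
If the number of levels is not known in advance, the same idea can be written as a loop over the levels. This is only a sketch: flatten_columns is a hypothetical helper name, not a pandas function, and the column labels are made up.

```python
import pandas as pd

def flatten_columns(columns):
    """Concatenate every level of a MultiIndex into one flat Index.

    Hypothetical helper, not part of pandas.
    """
    flat = columns.get_level_values(0).astype(str)
    for i in range(1, columns.nlevels):
        flat = flat + columns.get_level_values(i).astype(str)
    return flat

# Made-up three-level columns; the last level holds integers, so the
# astype(str) conversion is what makes the concatenation possible.
cols = pd.MultiIndex.from_tuples([("a", "x", 1), ("b", "y", 2)])
df = pd.DataFrame([[10, 20]], columns=cols)

df.columns = flatten_columns(df.columns)
print(list(df.columns))  # ['ax1', 'by2']
```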
    
