How to create new columns depending on row value in pandas

离开以前 2021-01-25 08:13

I have a dataframe that looks like this:

    time  speaker  label_1  label_2
0   0.25        1       10        4
1   0.25        2       10        5
2   0.50             


        
1 answer
  •  一整个雨季
    2021-01-25 08:45

    First, use pivot_table to pivot the rows into columns. The result has two-level columns of the form (label, speaker), so we then build the desired flat column names with a list comprehension and an f-string:

    piv = df.pivot_table(index='time', columns='speaker')
    piv.columns = [f'spk_{col[1]}_{col[0]}' for col in piv.columns]
    
          spk_1_label_1  spk_2_label_1  spk_1_label_2  spk_2_label_2
    time                                                            
    0.25             10             10              4              5
    0.50             10             10              6              7
    0.75             10             10              8              9
    1.00             10             10             11             12
    1.25             11             11             13             14
    1.50             11             11             15             16
    1.75             11             11             17             18
    2.00             11             11             19             20
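
For anyone who wants to run this end to end, here is a minimal sketch. The sample frame is an assumption: it contains only the first two time steps, with values taken from the tables shown in this thread, because the full input is not visible in the question. Note that pivot_table aggregates with the mean by default, so with one row per (time, speaker) pair the values are unchanged but may be displayed as floats.

```python
import pandas as pd

# Assumed sample data: only the first two time steps from the thread.
df = pd.DataFrame({
    "time": [0.25, 0.25, 0.50, 0.50],
    "speaker": [1, 2, 1, 2],
    "label_1": [10, 10, 10, 10],
    "label_2": [4, 5, 6, 7],
})

# Pivot: one row per time, columns become (label, speaker) tuples.
piv = df.pivot_table(index="time", columns="speaker")

# Flatten each (label, speaker) tuple into 'spk_<speaker>_<label>'.
piv.columns = [f"spk_{col[1]}_{col[0]}" for col in piv.columns]

print(piv)
```

If a (time, speaker) pair can occur more than once in your data, the default mean will silently average those rows, so it may be worth passing an explicit aggfunc.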
    

    If you want to remove the index name:

    piv.rename_axis(None, inplace=True)
    
          spk_1_label_1  spk_2_label_1  spk_1_label_2  spk_2_label_2
    0.25             10             10              4              5
    0.50             10             10              6              7
    0.75             10             10              8              9
    1.00             10             10             11             12
    1.25             11             11             13             14
    1.50             11             11             15             16
    1.75             11             11             17             18
    2.00             11             11             19             20
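
As a sketch of two related options (not part of the original answer): rename_axis can also be used without inplace=True, in which case it returns a new frame, and reset_index turns the index back into an ordinary time column if that is what you prefer. The sample data and the variable names unnamed and flat are assumptions for illustration.

```python
import pandas as pd

# Assumed sample data: only the first two time steps from the thread.
df = pd.DataFrame({
    "time": [0.25, 0.25, 0.50, 0.50],
    "speaker": [1, 2, 1, 2],
    "label_1": [10, 10, 10, 10],
    "label_2": [4, 5, 6, 7],
})
piv = df.pivot_table(index="time", columns="speaker")
piv.columns = [f"spk_{col[1]}_{col[0]}" for col in piv.columns]

# Non-mutating form: returns a new frame whose index has no name.
unnamed = piv.rename_axis(None)

# Alternative: move 'time' out of the index and back into a regular column.
flat = piv.reset_index()

print(unnamed)
print(flat)
```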
    

    Extra

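    To make this more general, use the name of the pivoted column as the prefix for the flattened columns. Run this on the freshly pivoted frame, while its columns are still two-level, because the level name is lost once the columns have been flattened: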

    piv.columns = [f'{piv.columns.names[1]}_{col[1]}_{col[0]}' for col in piv.columns]
    
          speaker_1_label_1  speaker_2_label_1  speaker_1_label_2  speaker_2_label_2
    time                                                                            
    0.25                 10                 10                  4                  5
    0.50                 10                 10                  6                  7
    0.75                 10                 10                  8                  9
    1.00                 10                 10                 11                 12
    1.25                 11                 11                 13                 14
    1.50                 11                 11                 15                 16
    1.75                 11                 11                 17                 18
    2.00                 11                 11                 19                 20
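
The same idea can be wrapped in a small helper. This is only a sketch: flatten_columns is a hypothetical name, not a pandas function, and it assumes the frame comes straight from pivot_table with a single columns= argument, so that the second column level carries the pivoted column's name. The sample data is likewise an assumption.

```python
import pandas as pd

def flatten_columns(piv):
    """Return a copy of piv with (label, key) columns flattened.

    Hypothetical helper: expects two-level columns as produced by
    pivot_table(index=..., columns=<one column>).
    """
    prefix = piv.columns.names[1]  # name of the pivoted column, e.g. 'speaker'
    out = piv.copy()
    out.columns = [f"{prefix}_{key}_{label}" for label, key in piv.columns]
    return out

# Assumed sample data: only the first two time steps from the thread.
df = pd.DataFrame({
    "time": [0.25, 0.25, 0.50, 0.50],
    "speaker": [1, 2, 1, 2],
    "label_1": [10, 10, 10, 10],
    "label_2": [4, 5, 6, 7],
})

result = flatten_columns(df.pivot_table(index="time", columns="speaker"))
print(result.columns.tolist())
```

Returning a copy keeps the original pivot intact, which makes it easy to try different naming schemes on the same frame.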
    

    Note: f-strings require Python 3.6 or later. On older versions, use .format for the string formatting instead (the speaker, col[1], goes first to match the names above):

    ['spk_{}_{}'.format(col[1], col[0]) for col in piv.columns]
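
To check the argument order without building a frame, here is a small sketch that applies .format to plain tuples. The list cols is an assumption that mimics the (label, speaker) tuples a pivot like the one above produces.

```python
# Assumed stand-in for piv.columns: (label, speaker) tuples.
cols = [("label_1", 1), ("label_1", 2), ("label_2", 1), ("label_2", 2)]

# Speaker first, then label, to get 'spk_<speaker>_<label>'.
names = ["spk_{}_{}".format(speaker, label) for label, speaker in cols]

print(names)
# ['spk_1_label_1', 'spk_2_label_1', 'spk_1_label_2', 'spk_2_label_2']
```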
    
