Pandas DataFrame.groupby() to dictionary with multiple columns for value

执笔经年 2020-12-31 12:25
type(Table)
pandas.core.frame.DataFrame

Table
======= ======= =======
Column1 Column2 Column3
0       23      1
1       5       2
1       2       3
1       19      5


        
3 Answers
  •  自闭症患者
    2020-12-31 13:13

    One way is to add a helper column, tup, that holds each row's (Column2, Column3) pair as a tuple, then group by Column1 and collect the tuples into a dictionary.

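    # Pair up Column2 and Column3 row by row in a helper column.
    df['tup'] = list(zip(df['Column2'], df['Column3']))
    # Collect each group's tuples into a list, keyed by Column1.
    df.groupby('Column1')['tup'].apply(list).to_dict()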
    
    # {0: [(23, 1)],
    #  1: [(5, 2), (2, 3), (19, 5)],
    #  2: [(56, 1), (22, 2)],
    #  3: [(2, 4), (14, 5)],
    #  4: [(59, 1)],
    #  5: [(44, 1), (1, 2), (87, 3)]}
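
    The same idea can be written without adding a helper column to the original frame. This is a minimal sketch, assuming a small sample frame built from the first four rows of the question (the Column3 value 5 in the last row is taken from the output above). It uses DataFrame.assign so the work happens on a copy:

```python
import pandas as pd

# Sample data: the first four rows shown in the question (an assumption;
# the full table is not reproduced here).
df = pd.DataFrame({
    "Column1": [0, 1, 1, 1],
    "Column2": [23, 5, 2, 19],
    "Column3": [1, 2, 3, 5],
})

# assign() returns a new frame, so df itself keeps its original columns.
pairs = df.assign(tup=list(zip(df["Column2"], df["Column3"])))

# Collect each group's tuples into a list, keyed by Column1.
result = pairs.groupby("Column1")["tup"].apply(list).to_dict()
print(result)
```

    Whether to mutate df or work on a copy is a matter of taste here. The copy costs some memory but leaves no stray tup column behind.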
    

    @Psidom's solution is more efficient, but if performance isn't an issue, use whichever makes more sense to you. A rough comparison on the data repeated 10,000 times:

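    import pandas as pd
    
    # Repeat the data 10,000 times so the difference is measurable.
    df = pd.concat([df]*10000)
    
    def jp(df):
        # Helper-column approach (note: adds 'tup' to df in place).
        df['tup'] = list(zip(df['Column2'], df['Column3']))
        return df.groupby('Column1')['tup'].apply(list).to_dict()
    
    def psi(df):
        # @Psidom's approach: build the tuples per group, no helper column.
        return df.groupby('Column1')[['Column2', 'Column3']].apply(lambda g: list(map(tuple, g.values.tolist()))).to_dict()
    
    # %timeit is an IPython/Jupyter magic, not plain Python.
    %timeit jp(df)   # 110ms
    %timeit psi(df)  # 80ms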
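
    %timeit only works in IPython or Jupyter. In a plain Python script, a comparable check can be sketched with the standard-library timeit module. The sample frame and the repeat counts below are assumptions chosen to keep the run short, and jp is written with assign so repeated calls do not modify df. No timings are quoted, since they depend on the machine and the pandas version:

```python
import timeit

import pandas as pd

# Assumed sample data (first four rows of the question), repeated 1,000 times.
base = pd.DataFrame({
    "Column1": [0, 1, 1, 1],
    "Column2": [23, 5, 2, 19],
    "Column3": [1, 2, 3, 5],
})
df = pd.concat([base] * 1000, ignore_index=True)


def jp(df):
    # Helper-column approach, on a copy so df is left untouched.
    pairs = df.assign(tup=list(zip(df["Column2"], df["Column3"])))
    return pairs.groupby("Column1")["tup"].apply(list).to_dict()


def psi(df):
    # Per-group approach: turn each group's rows into tuples directly.
    grouped = df.groupby("Column1")[["Column2", "Column3"]]
    return grouped.apply(lambda g: list(map(tuple, g.values.tolist()))).to_dict()


# Both functions should produce the same dictionary before being compared.
assert jp(df) == psi(df)

runs = 5
for fn in (jp, psi):
    seconds = timeit.timeit(lambda: fn(df), number=runs)
    print(f"{fn.__name__}: {seconds / runs * 1000:.1f} ms per call")
```

    Checking that both functions return the same result before timing them guards against comparing two implementations that do different work.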
    
