Pandas groupby result into multiple columns

渐次进展 2020-12-31 16:05

I have a dataframe in which I'm looking to group and then partition the values within a group into multiple columns.

For example: say I have the following dataframe

3 Answers
  •  余生分开走
    2020-12-31 16:33

    You could use

    id_df = grouped['ID'].apply(lambda x: pd.Series(x.values)).unstack()
    

    to create id_df without the intermediate result DataFrame.
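To see why the one-liner works, it may help to split it into its two steps. The sketch below assumes a small frame with the same `Group` and `ID` columns as the answer's example; the name `stacked` is only illustrative and does not appear in the original answer.

```python
import pandas as pd

df = pd.DataFrame({"Group": ["A", "C", "B", "A", "C", "C"],
                   "ID": [1, 2, 3, 4, 5, 6]})
grouped = df.groupby("Group")

# Step 1: returning a fresh Series from apply gives every group its own
# 0..n-1 index, so the combined result is one Series indexed by
# (Group, position within the group).
stacked = grouped["ID"].apply(lambda x: pd.Series(x.values))

# Step 2: unstack moves the inner level (the position) into columns.
# Groups with fewer IDs are padded with NaN, which makes the columns float.
id_df = stacked.unstack()
print(id_df)
```

The columns of `id_df` are the integer positions 0, 1, 2, which is why a rename step is needed afterwards to get labels such as `ID1`.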


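    import pandas as pd
    import numpy as np
    np.random.seed(2016)  # fixed seed so the random values are reproducible
    
    df = pd.DataFrame({'Group': ['A', 'C', 'B', 'A', 'C', 'C'],
                       'ID': [1, 2, 3, 4, 5, 6],
                       'Value': np.random.randint(1, 100, 6)})
    
    grouped = df.groupby('Group')
    # Sum of Value per group
    values = grouped['Value'].agg('sum')
    # One row per group, one column per ID position within the group
    id_df = grouped['ID'].apply(lambda x: pd.Series(x.values)).unstack()
    # Rename the positional columns 0, 1, 2, ... to ID1, ID2, ID3, ...
    id_df = id_df.rename(columns={i: 'ID{}'.format(i + 1) for i in range(id_df.shape[1])})
    # Put the ID columns and the per-group sums side by side
    result = pd.concat([id_df, values], axis=1)
    print(result)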
    

    yields

           ID1  ID2  ID3  Value
    Group                      
    A        1    4  NaN     77
    B        3  NaN  NaN     84
    C        2    5    6     86
    
