Aggregate unique values from multiple columns with pandas GroupBy

借酒劲吻你 2020-12-18 00:56

I went through countless threads (1 2 3...) and I still can't find a solution to my problem. I have a dataframe like this:

prop1 prop2 prop3    prop4 
L30            


        
3 Answers
  •  情书的邮戳
    2020-12-18 01:29

    Use groupby and agg, and aggregate only unique values by calling Series.unique:

    df.astype(str).groupby('prop1').agg(lambda x: ','.join(x.unique()))
    
                prop2       prop3      prop4
    prop1                                   
    K20       12,1,66  travis,leo   10.0,4.0
    L30    3,54,11,10    bob,john  11.2,10.0
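
For a version that can be run as-is, here is a minimal sketch. The sample rows are made up for illustration (the question's table is cut off in this copy), so the printed values will not match the output above exactly.

```python
import pandas as pd

# Hypothetical sample data, not the asker's original frame.
df = pd.DataFrame({
    "prop1": ["L30", "L30", "K20", "L30", "K20"],
    "prop2": [3, 54, 12, 3, 1],
    "prop3": ["bob", "john", "travis", "bob", "leo"],
    "prop4": [11.2, 10.0, 10.0, 11.2, 4.0],
})

# Cast every column to str so that ','.join also works on numeric columns,
# then join each group's distinct values in order of first appearance.
result = df.astype(str).groupby("prop1").agg(lambda x: ",".join(x.unique()))
print(result)
```

The cast to str is what lets the same lambda handle prop2 and prop4; without it, ','.join would raise a TypeError on the numeric columns.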
    

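    By default groupby sorts the group keys. To keep the groups in the order they first appear in the dataframe, pass sort=False:

    df.astype(str).groupby('prop1', sort=False).agg(lambda x: ','.join(x.unique()))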
    
                prop2       prop3      prop4
    prop1                                   
    L30    3,54,11,10    bob,john  11.2,10.0
    K20       12,1,66  travis,leo   10.0,4.0
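
The effect of sort=False on group order can be checked with a small sketch. The data and the helper name join_unique are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical data: "L30" appears before "K20" in the rows.
df = pd.DataFrame({
    "prop1": ["L30", "K20", "L30"],
    "prop3": ["bob", "travis", "john"],
})

def join_unique(values):
    # Join the distinct values of one column within one group.
    return ",".join(values.unique())

# Default: group keys are sorted.
sorted_groups = df.groupby("prop1").agg(join_unique)
# sort=False: group keys keep their order of first appearance.
ordered_groups = df.groupby("prop1", sort=False).agg(join_unique)

print(list(sorted_groups.index))   # ['K20', 'L30']
print(list(ordered_groups.index))  # ['L30', 'K20']
```

Only the row order of the result changes; the aggregated strings for each group are the same in both cases.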
    

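    If the data contains NaNs, fill them with empty strings first, then clean up the extra commas the empty strings leave behind:

    import re
    df.fillna('').astype(str).groupby('prop1').agg(
        # Collapse runs of commas, then strip any comma left at either end
        # (an empty string that sorts first or last would otherwise leave one).
        lambda x: re.sub(',+', ',', ','.join(x.unique())).strip(',')
    )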
    
                prop2       prop3      prop4
    prop1                                   
    K20       12,1,66  travis,leo   10.0,4.0
    L30    3,54,11,10    bob,john  11.2,10.0
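
An alternative sketch, not part of the original answer, is to drop the missing values inside the aggregation so that no placeholder has to be cleaned up afterwards. The sample data and the helper name join_unique_non_null are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data with missing values in both a numeric and a text column.
df = pd.DataFrame({
    "prop1": ["L30", "L30", "K20", "K20"],
    "prop2": [np.nan, 54.0, 12.0, 1.0],
    "prop3": [None, "bob", "travis", None],
})

def join_unique_non_null(col):
    # Drop NaN/None first, stringify what remains, then join distinct values.
    return ",".join(col.dropna().astype(str).unique())

result = df.groupby("prop1").agg(join_unique_non_null)
print(result)
```

A group whose values are all missing ends up as an empty string with this approach.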
    
