Aggregate unique values from multiple columns with pandas GroupBy

后端 未结 3 2091
借酒劲吻你
借酒劲吻你 2020-12-18 00:56

I went into countless threads (1 2 3...) and still I don\'t find a solution to my problem... I have a dataframe like this:

prop1 prop2 prop3    prop4 
L30            


        
相关标签:
3条回答
  • 2020-12-18 01:03

    Try this, it worked for me perfectly:

    df.groupby(['prop1','prop2', 'prop4']).agg(lambda x: ','.join(x.unique())).reset_index()
    

    This will give the result as:

    prop1  prop2       prop3       prop4
    L30    3,54,11,10  bob,john    11.2,10
    K20    12,1,66     travis,leo  10,4
    
    0 讨论(0)
  • 2020-12-18 01:21

    melt + pivot_table

    s = df.astype(str).melt(id_vars='prop1').drop_duplicates()
    
    s.pivot_table(
      index='prop1',
      columns='variable',
      values='value',
      aggfunc=','.join)
    

    variable       prop2       prop3      prop4
    prop1
    K20          12,1,66  travis,leo   10.0,4.0
    L30       3,54,11,10    bob,john  11.2,10.0
    
    0 讨论(0)
  • 2020-12-18 01:29

    Use groupby and agg, and aggregate only unique values by calling Series.unique:

    df.astype(str).groupby('prop1').agg(lambda x: ','.join(x.unique()))
    
                prop2       prop3      prop4
    prop1                                   
    K20       12,1,66  travis,leo   10.0,4.0
    L30    3,54,11,10    bob,john  11.2,10.0
    

    df.astype(str).groupby('prop1', sort=False).agg(lambda x: ','.join(x.unique()))
    
                prop2       prop3      prop4
    prop1                                   
    L30    3,54,11,10    bob,john  11.2,10.0
    K20       12,1,66  travis,leo   10.0,4.0
    

    If handling NaNs is important, call fillna in advance:

    import re
    df.fillna('').astype(str).groupby('prop1').agg(
        lambda x: re.sub(',+', ',', ','.join(x.unique()))
    )
    
                prop2       prop3      prop4
    prop1                                   
    K20       12,1,66  travis,leo   10.0,4.0
    L30    3,54,11,10    bob,john  11.2,10.0
    
    0 讨论(0)
提交回复
热议问题