How to summarize on different groupby combinations?

后悔当初 2020-12-04 02:59

I am compiling a table of top-3 crops by county. Some counties have the same crop varieties in the same order. Other counties have the same crop varieties in a different ord

5 Answers
  •  温柔的废话
    2020-12-04 03:47

    Here is one way to do it.

    First, collect the unique crop values in each row, sort them, and write them back to the DataFrame, so that counties with the same three crops end up with identical Crop1/Crop2/Crop3 values regardless of the original order. We do this on a copy, since you might need to preserve the original data.

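    import pandas as pd

    # Work on a copy so the original df1 stays untouched
    df = df1.copy()

    to_sum = ['Crop1', 'Crop2', 'Crop3']

    # Per row: take the unique crops, sort them, and rebuild the three columns.
    # Assumes the three crops in a row are distinct (set() would drop duplicates).
    df[to_sum] = pd.DataFrame(
        df.loc[:, to_sum]
          .apply(set, axis=1)
          .apply(sorted)
          .tolist(),
        columns=to_sum,
        index=df.index,  # keep row alignment even if the index is not 0..n-1
    )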
    
    print(df)
    
           County    Crop1  Crop2    Crop3  Total_pop
    0      Harney   apples  grain   melons       2000
    1       Baker   apples  grain   melons       1500
    2     Wheeler   apples  grain   melons       3000
    3  Hood River   apples  grain   melons       1500
    4       Wasco  carrots  pears  raddish       2000
    5      Morrow  carrots  pears  raddish       2500
    6       Union  carrots  pears  raddish       2700
    7        Lake  carrots  pears  raddish       2000
    

    Now we can perform our groupby to get the desired results.

    df.groupby(to_sum).Total_pop.sum()
    
    Crop1    Crop2  Crop3  
    apples   grain  melons     8000
    carrots  pears  raddish    9200
    Name: Total_pop, dtype: int64
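
    For anyone who wants to run this end to end, here is a minimal self-contained sketch. The original df1 is not shown in the question, so the frame below is an assumption: the county names and populations are taken from the printed output above, while the per-county crop order is made up for illustration. It also swaps the set/sorted step for np.sort along each row, which is an alternative to the approach above, not the same code.

```python
import numpy as np
import pandas as pd

# Hypothetical input: crop order per county is invented for illustration
df1 = pd.DataFrame({
    "County": ["Harney", "Baker", "Wheeler", "Hood River",
               "Wasco", "Morrow", "Union", "Lake"],
    "Crop1": ["grain", "melons", "apples", "melons",
              "carrots", "raddish", "pears", "carrots"],
    "Crop2": ["melons", "grain", "melons", "apples",
              "raddish", "pears", "carrots", "pears"],
    "Crop3": ["apples", "apples", "grain", "grain",
              "pears", "carrots", "raddish", "raddish"],
    "Total_pop": [2000, 1500, 3000, 1500, 2000, 2500, 2700, 2000],
})

to_sum = ["Crop1", "Crop2", "Crop3"]

# Work on a copy so df1 keeps the original crop order
df = df1.copy()

# Sort the crop names within each row so column order no longer matters
df[to_sum] = np.sort(df[to_sum].to_numpy(dtype=object), axis=1)

# Sum the population for each distinct combination of crops
result = df.groupby(to_sum)["Total_pop"].sum()
print(result)
```

    With this sample data the two groups sum to 8000 and 9200, matching the output above. Unlike set(), np.sort keeps duplicate values within a row, so every row still has three entries even if a county lists the same crop twice.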
    
