Sum two rows if two cells are the same but in different order

给你一囗甜甜゛ submitted on 2021-02-05 08:37:12

Question


I have a dataframe similar to the one below:

Buyer Seller Amount
John  Mary   3
Mary  John   2
David Bosco  2

I want to sum the John and Mary rows into one, because they contain the same pair of names in a different order.

Expected outcome:

Trade1 Trade2 Amount
John   Mary   5
David  Bosco  2

My dataframe has around 6000 rows. Thank you for your help.


Answer 1:


First sort the values within each row with numpy.sort, so that both orderings of a pair become identical, then group by the sorted columns and aggregate with sum:

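import numpy as np
import pandas as pd

# Sort the two names within each row; assigning the raw array avoids index alignment issues
df[['Buyer','Seller']] = np.sort(df[['Buyer','Seller']], axis=1)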

df2 = df.groupby(['Buyer','Seller'], as_index=False)['Amount'].sum()
df2.columns = ['Trade1','Trade2','Amount']
print (df2)
  Trade1 Trade2  Amount
0  Bosco  David       2
1   John   Mary       5

If you don't want to modify the original columns, use groupby with Series as the grouping keys (syntactic sugar):

# Keep df's index so the grouping keys align with df['Amount']
df1 = pd.DataFrame(np.sort(df[['Buyer','Seller']], axis=1), index=df.index)
df1.columns = ['Trade1','Trade2']

df2 = df['Amount'].groupby([df1['Trade1'],df1['Trade2']]).sum().reset_index()
print (df2)
  Trade1 Trade2  Amount
0  Bosco  David       2
1   John   Mary       5
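
For reference, the second approach can be put together into a self-contained script using the sample data from the question. This is only a minimal sketch: the helper name sum_unordered_pairs is hypothetical (it does not come from the original answer), and it assumes the columns are named Buyer, Seller and Amount as in the question.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Buyer": ["John", "Mary", "David"],
    "Seller": ["Mary", "John", "Bosco"],
    "Amount": [3, 2, 2],
})


def sum_unordered_pairs(frame):
    """Hypothetical helper: sum Amount per Buyer/Seller pair, ignoring order."""
    # Sort the two names within each row so (John, Mary) and (Mary, John) match
    pairs = pd.DataFrame(
        np.sort(frame[["Buyer", "Seller"]].to_numpy(), axis=1),
        index=frame.index,  # keep the original index so the keys align with Amount
        columns=["Trade1", "Trade2"],
    )
    # Group the untouched Amount column by the sorted pair
    return (
        frame["Amount"]
        .groupby([pairs["Trade1"], pairs["Trade2"]])
        .sum()
        .reset_index()
    )


result = sum_unordered_pairs(df)
print(result)
```

Note that the names in each pair come out in alphabetical order, so the David/Bosco row is reported as Bosco/David rather than in the order shown in the question's expected outcome. The original df is left unchanged.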


Source: https://stackoverflow.com/questions/51874962/sum-two-rows-if-two-cells-are-the-same-but-in-different-order
