Group by and sum over rows with same contents [duplicate]

谁都会走 submitted on 2020-01-30 12:44:37

Question


I have a data frame with three numeric columns. The first two columns, A and B, form an unordered pair, so a row with (1920, 1080) should be treated the same as a row with (1080, 1920). I want to group by that pair and sum the third column:


df.groupby(['A','B']).sum() --- won't work here, because it treats (1080, 1920) and (1920, 1080) as different groups

Example:

 A        B   counter
750     1334    10
1080    1920    15
1080    1920    10
1920    1080    10
1125    2436    20

Expected result:

 A        B   counter
750     1334    10
1080    1920    35
1125    2436    20

Answer 1:


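The idea is to sort the A and B values within each row with numpy.sort and assign them back, so that both orderings of a pair become identical:

import numpy as np

# sort each row's (A, B) pair so that (1920, 1080) becomes (1080, 1920)
df[['A','B']] = np.sort(df[['A','B']], axis=1)

# sum counter within each normalized pair
df = df.groupby(['A','B'], as_index=False)['counter'].sum()
print(df)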
      A     B  counter
0   750  1334       10
1  1080  1920       35
2  1125  2436       20
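
For anyone who wants to try this without an existing data frame, here is a minimal self-contained sketch. It rebuilds the sample data from the question by hand; the variable name `result` and the explicit `.to_numpy()` call are my own choices, not part of the answer above.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "A": [750, 1080, 1080, 1920, 1125],
    "B": [1334, 1920, 1920, 1080, 2436],
    "counter": [10, 15, 10, 10, 20],
})

# Sort the two values within each row (axis=1), so that
# (1920, 1080) and (1080, 1920) end up as the same pair
df[["A", "B"]] = np.sort(df[["A", "B"]].to_numpy(), axis=1)

# Group on the normalized pair and sum the counter column
result = df.groupby(["A", "B"], as_index=False)["counter"].sum()
print(result)
```

Note that this overwrites A and B in place, so the original column order of each pair is lost.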

Alternatively, keep the sorted values in a separate array and pass its columns to groupby, without overwriting A and B first:

arr = np.sort(df[['A','B']], axis=1)
df = df.groupby([arr[:, 0],arr[:, 1]])['counter'].sum().rename_axis(('A','B')).reset_index()
print (df)
      A     B  counter
0   750  1334       10
1  1080  1920       35
2  1125  2436       20
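
If the original frame should stay untouched, the array-based variant can be wrapped in a small helper. This is only a sketch: the function name `sum_unordered_pairs` and its parameters are hypothetical, and it assumes the two key columns hold values that can be compared and sorted against each other.

```python
import numpy as np
import pandas as pd

def sum_unordered_pairs(frame, col_a="A", col_b="B", value="counter"):
    """Sum `value` over rows whose (col_a, col_b) pairs match in any order."""
    # Row-wise sorted copy of the key columns; `frame` itself is not modified
    pairs = np.sort(frame[[col_a, col_b]].to_numpy(), axis=1)
    return (
        frame.groupby([pairs[:, 0], pairs[:, 1]])[value]
        .sum()
        .rename_axis([col_a, col_b])  # name the two index levels
        .reset_index()
    )

df = pd.DataFrame({
    "A": [750, 1080, 1080, 1920, 1125],
    "B": [1334, 1920, 1920, 1080, 2436],
    "counter": [10, 15, 10, 10, 20],
})

out = sum_unordered_pairs(df)
print(out)
```

After the call, `df` still contains the row with A=1920 and B=1080, while `out` holds one row per unordered pair.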


Source: https://stackoverflow.com/questions/56650372/group-by-and-sum-over-rows-with-same-contents
