Pandas - dataframe groupby - how to get sum of multiple columns

你说的曾经没有我的故事 posted on 2019-11-27 22:13:39

By using apply

df.groupby(['col1', 'col2'])[["col3", "col4"]].apply(lambda x : x.astype(int).sum())
Out[1257]: 
           col3  col4
col1 col2            
a    c        2     4
     d        1     2
b    d        1     2
     e        2     4
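For anyone who wants to try this end to end, here is a minimal sketch. The sample data is an assumption (the original question's frame is not shown here); it was chosen so that the numeric columns are stored as strings and the result matches the table above.

```python
import pandas as pd

# Hypothetical sample data: col3 and col4 hold numbers stored as strings
df = pd.DataFrame({
    "col1": ["a", "a", "a", "b", "b", "b"],
    "col2": ["c", "c", "d", "d", "e", "e"],
    "col3": ["1", "1", "1", "1", "1", "1"],
    "col4": ["2", "2", "2", "2", "2", "2"],
})

# Select the columns with a list (double brackets), then cast each
# group to int and sum it column by column
result = df.groupby(["col1", "col2"])[["col3", "col4"]].apply(
    lambda g: g.astype(int).sum()
)
print(result)
```

Note the double brackets: selecting several columns from a groupby with a bare tuple such as ["col3", "col4"] is rejected by recent pandas versions.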

If you would rather use agg:

df.groupby(['col1', 'col2']).agg({'col3':'sum','col4':'sum'})

Another generic solution is

df.groupby(['col1','col2']).agg({'col3':'sum','col4':'sum'}).reset_index()

This will give you the required output, with col1 and col2 as regular columns instead of the index.
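A small sketch of the difference between the two agg variants, using made-up numeric data (the frame below is an assumption, not the asker's data):

```python
import pandas as pd

# Hypothetical sample data with numeric col3 and col4
df = pd.DataFrame({
    "col1": ["a", "a", "a", "b", "b", "b"],
    "col2": ["c", "c", "d", "d", "e", "e"],
    "col3": [1, 1, 1, 1, 1, 1],
    "col4": [2, 2, 2, 2, 2, 2],
})

# Without reset_index the group keys become a MultiIndex
summed = df.groupby(["col1", "col2"]).agg({"col3": "sum", "col4": "sum"})

# With reset_index the group keys come back as ordinary columns
flat = summed.reset_index()

print(summed)
print(flat)
```

Which form is preferable probably depends on what happens next: the indexed version is handy for lookups by group, the flat one for merging or exporting.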

The issue is likely that df.col3.dtype is not an int or other numeric datatype. Try df.col3 = df.col3.astype(int) before doing your groupby.
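One way to check this, sketched with made-up data (the frame is an assumption): inspect whether the columns are numeric, convert them, then group and sum.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

# Hypothetical sample data: the numbers were read in as strings
df = pd.DataFrame({
    "col1": ["a", "a", "b"],
    "col2": ["c", "c", "d"],
    "col3": ["1", "1", "1"],
    "col4": ["2", "2", "2"],
})

# Before conversion the columns are not numeric
numeric_before = is_numeric_dtype(df["col3"])

# Convert to int so that sum() adds numbers
df["col3"] = df["col3"].astype(int)
df["col4"] = df["col4"].astype(int)
numeric_after = is_numeric_dtype(df["col3"])

totals = df.groupby(["col1", "col2"])[["col3", "col4"]].sum()
print(numeric_before, numeric_after)
print(totals)
```

If some values might not parse as integers, pd.to_numeric(df["col3"], errors="coerce") could be used instead; it turns unparseable entries into NaN rather than raising.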

Additionally, select your columns after the groupby to see if the columns are even being aggregated:

df_new = df.groupby(['col1', 'col2']).sum()[["col3", "col4"]]

The above answer didn't work for me:

df_new = df.groupby(['col1', 'col2']).sum()[["col3", "col4"]]

I was grouping by a single column and summing one column.

Here is what worked for me:

D1.groupby(['col1'])['col2'].sum()  # select the column first; sum() goes at the end, not in the middle
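A runnable sketch of that single-key case, with a made-up frame (the data in D1 is an assumption):

```python
import pandas as pd

# Hypothetical sample data
D1 = pd.DataFrame({
    "col1": ["a", "a", "b"],
    "col2": [1, 2, 3],
})

# Pick the column to aggregate first, then call sum() last
totals = D1.groupby(["col1"])["col2"].sum()
print(totals)
```

Selecting the column before calling sum() means only that column is aggregated, which avoids problems with other columns that may not be summable.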