Pandas: sum DataFrame rows for given columns

Anonymous (unverified), submitted 2019-12-03 02:50:02

Question:

I have the following DataFrame:

import pandas as pd
df = pd.DataFrame({'a': [1,2,3], 'b': [2,3,4], 'c':['dd','ee','ff'], 'd':[5,9,1]})

I would like to add a column 'e' which is the sum of column 'a', 'b' and 'd'.

Going across forums, I thought something like this would work:

df['e'] = df[['a','b','d']].map(sum) 

But no!

I would like to perform the operation with the list of columns ['a','b','d'] and df as inputs.

Answer 1:

You can just call sum with axis=1 to sum across each row; this ignores non-numeric columns (in pandas 2.0 and later they are no longer dropped silently, so pass numeric_only=True there):

In [91]:
df = pd.DataFrame({'a': [1,2,3], 'b': [2,3,4], 'c':['dd','ee','ff'], 'd':[5,9,1]})
df['e'] = df.sum(axis=1)
df

Out[91]:
   a  b   c  d   e
0  1  2  dd  5   8
1  2  3  ee  9  14
2  3  4  ff  1   8
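As a minimal sketch of a form that should behave the same on older and newer pandas releases, the non-numeric column can be skipped explicitly with the numeric_only parameter of DataFrame.sum. The data is the same as in the question; nothing else is assumed.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [2, 3, 4],
                   'c': ['dd', 'ee', 'ff'], 'd': [5, 9, 1]})

# numeric_only=True leaves the string column 'c' out of the row-wise sum
df['e'] = df.sum(axis=1, numeric_only=True)
print(df)
```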

If you want to just sum specific columns then you can create a list of the columns and remove the ones you are not interested in:

In [98]:
col_list= list(df)
col_list.remove('d')
col_list

Out[98]:
['a', 'b', 'c']

In [99]:
df['e'] = df[col_list].sum(axis=1)
df

Out[99]:
   a  b   c  d  e
0  1  2  dd  5  3
1  2  3  ee  9  5
2  3  4  ff  1  7
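Since the question asks for an operation that takes the list of columns and df as inputs, the same idea can be sketched as a small helper. The function name add_row_sum and its target parameter are hypothetical, introduced here only for illustration; it assumes every column in cols is numeric.

```python
import pandas as pd

def add_row_sum(df, cols, target):
    """Hypothetical helper: return a copy of df with a new column
    `target` holding the row-wise sum of the columns listed in `cols`."""
    out = df.copy()                      # leave the caller's frame untouched
    out[target] = out[cols].sum(axis=1)  # select the columns, then sum each row
    return out

df = pd.DataFrame({'a': [1, 2, 3], 'b': [2, 3, 4],
                   'c': ['dd', 'ee', 'ff'], 'd': [5, 9, 1]})

result = add_row_sum(df, ['a', 'b', 'd'], 'e')
print(result)
```

Selecting the wanted columns directly avoids having to build the full column list and remove entries from it.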


Answer 2:

If you have just a few columns to sum, you can write:

df['e'] = df['a'] + df['b'] + df['d'] 

This creates new column e with the values:

   a  b   c  d   e
0  1  2  dd  5   8
1  2  3  ee  9  14
2  3  4  ff  1   8

For longer lists of columns, EdChum's answer is preferred.



Answer 3:

This is a simpler way using iloc to select which columns to sum:

df['f']=df.iloc[:,0:2].sum(axis=1)
df['g']=df.iloc[:,[0,1]].sum(axis=1)
df['h']=df.iloc[:,[0,3]].sum(axis=1)

Produces:

   a  b   c  d   e  f  g   h
0  1  2  dd  5   8  3  3   6
1  2  3  ee  9  14  5  5  11
2  3  4  ff  1   8  7  7   4

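I can't find a way to combine a range and specific columns that works, e.g. something like the following (both lines are a SyntaxError, because slice notation is not allowed inside a list literal):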

df['i']=df.iloc[:,[[0:2],3]].sum(axis=1)
df['i']=df.iloc[:,[0:2,3]].sum(axis=1)
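One option that is not part of the original answer is NumPy's np.r_ helper, which expands slice notation and single integers into one flat array of positions that iloc accepts. The following is a minimal sketch on the question's data; the variable name positions is just illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [2, 3, 4],
                   'c': ['dd', 'ee', 'ff'], 'd': [5, 9, 1]})

# np.r_ concatenates the range 0:2 and the single position 3 into [0, 1, 3]
positions = np.r_[0:2, 3]

# Positions 0, 1 and 3 are columns 'a', 'b' and 'd'
df['i'] = df.iloc[:, positions].sum(axis=1)
print(df)
```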

