Selecting column after groupby without using explicit column name

Question

With the following dataset:

import pandas as pd
df = pd.DataFrame({'Date':['26-12-2018','26-12-2018','27-12-2018','27-12-2018','28-12-2018','28-12-2018'],
                   'In':['A','B','D','Z','Q','E'],
                   'Out' : ['Z', 'D', 'F', 'H', 'Z', 'A'],
                   'Score_in' : ['6', '2', '1', '0', '1', '3'], 
                   'Score_out' : ['2','3','0', '1','1','3'],
                   'Place' : ['One','Two','Four', 'Two','Two','One']})

I would like to write groupby rules in a generic form so that I can parameterize how subsets are created. For instance, instead of the following:

df.groupby('In').Score_in.sum()

I would like to refer to the columns by position rather than by name, with something like #1 or #2 below, using the df.columns[] or .iloc[:,[]] syntax:

df.groupby(df.columns[1]).df.iloc[:,[3]].sum() #1
df.groupby(df.iloc[:,[0]]).df.iloc[:,[3]].sum() #2

Of course, neither of the above works. Any help?

Answer 1:

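The problem is not the groupby itself but how you select a column afterwards. A GroupBy object has no df attribute, so chaining .df.iloc[...] onto it cannot work. Index the GroupBy object with the column label instead, which df.columns[] gives you from a position.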

Here is a piece of code that works as expected:

df.groupby(df.columns[1])[df.columns[3]].sum()

In  Score_in
A   6
B   2
D   1
E   3
Q   1
Z   0

Note: I cast Score_in and Score_out to integers first. The dataset stores them as strings, so the sum would not be numeric otherwise.
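
To make the whole thing reusable, the cast and the positional lookup can be wrapped in a small helper. This is only a sketch: the function name sum_by_position and its parameters key_pos and value_pos are made up for illustration and are not part of pandas.

```python
import pandas as pd

df = pd.DataFrame({
    'Date': ['26-12-2018', '26-12-2018', '27-12-2018',
             '27-12-2018', '28-12-2018', '28-12-2018'],
    'In': ['A', 'B', 'D', 'Z', 'Q', 'E'],
    'Out': ['Z', 'D', 'F', 'H', 'Z', 'A'],
    'Score_in': ['6', '2', '1', '0', '1', '3'],
    'Score_out': ['2', '3', '0', '1', '1', '3'],
    'Place': ['One', 'Two', 'Four', 'Two', 'Two', 'One'],
})

# The scores are stored as strings, so convert them before summing.
df[['Score_in', 'Score_out']] = df[['Score_in', 'Score_out']].astype(int)


def sum_by_position(frame, key_pos, value_pos):
    """Group by the column at key_pos and sum the column at value_pos.

    Hypothetical helper: both arguments are integer column positions.
    """
    key = frame.columns[key_pos]      # position -> column label
    value = frame.columns[value_pos]
    return frame.groupby(key)[value].sum()


# Same result as df.groupby('In')['Score_in'].sum()
result = sum_by_position(df, 1, 3)
print(result)
```

Translating positions into labels with df.columns[] keeps the groupby call in its usual form, so the result is still a Series indexed by the grouping column and named after the summed column.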

Source: https://stackoverflow.com/questions/62589067/selecting-column-after-groupby-without-using-explicit-column-name

Tags

python

pandas

dataframe

functional-programming
