How to do a groupby on an empty set of columns in Pandas?

Posted by 别来无恙 on 2019-12-23 20:24:26

Question


I have hit a corner case in pandas. I am trying to use the agg function without doing a groupby first. Say I want an aggregation over the entire dataframe, i.e.

import numpy as np
from pandas import DataFrame
DF = DataFrame(np.random.randn(5, 3), index=list("ABCDE"), columns=list("abc"))
DF.groupby([]).agg({'a': np.sum, 'b': np.mean})  # <--- does not work

Calling DF.agg( {'a' ... } ) directly does not work either.

My workaround is to add a constant column with DF['Total'] = 'Total' and then call DF.groupby(['Total']), but this seems a bit artificial.

Has anyone got a cleaner solution?


Answer 1:


This isn't great either, but for this case, if you pass groupby a function that always returns True, at least you don't have to change df:

>>> import numpy as np
>>> from pandas import DataFrame
>>> df = DataFrame(np.random.randn(5, 3), index=list("ABCDE"), columns=list("abc"))
>>> df.groupby(lambda x: True).agg({'a' : np.sum, 'b' : np.mean } )
             a         b
True  1.836649 -0.692655
>>> 
>>> df['total'] = 'total'
>>> df.groupby(['total']).agg({'a' : np.sum, 'b' : np.mean } ) 
              a         b
total                    
total  1.836649 -0.692655

You could use various builtins instead of lambda x: True, but they're less explicit and only work accidentally.
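If you use the constant-key trick in more than one place, it can be wrapped in a small helper. The following is only a sketch: agg_whole_frame is a hypothetical name, not a pandas function, and the frame uses fixed values instead of random ones so the result is predictable. It passes the aggregations as the strings "sum" and "mean" rather than np.sum and np.mean, which newer pandas versions tend to prefer.

```python
import numpy as np
import pandas as pd

# Deterministic stand-in for the random frame used above.
df = pd.DataFrame(
    np.arange(15, dtype=float).reshape(5, 3),
    index=list("ABCDE"),
    columns=list("abc"),
)

def agg_whole_frame(frame, spec):
    """Hypothetical helper: aggregate every row as one group.

    The grouping function is called on each index label and always
    returns True, so all rows end up in a single group keyed by True.
    """
    return frame.groupby(lambda label: True).agg(spec)

result = agg_whole_frame(df, {"a": "sum", "b": "mean"})
print(result)
```

The result is a one-row DataFrame whose only index label is True, and df itself is left unchanged.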




Answer 2:


Having an analogous DataFrame.aggregate method is a good idea. I've created an issue here:

https://github.com/pydata/pandas/issues/1623
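For readers on a more recent pandas: as far as I know, DataFrame.agg (alias DataFrame.aggregate) has been available since pandas 0.20 and accepts a dict of column-to-function mappings directly, so no dummy grouping is needed. A minimal sketch, assuming such a version is installed and using fixed values in place of the random frame from the question:

```python
import numpy as np
import pandas as pd

# Deterministic stand-in for the random frame in the question.
df = pd.DataFrame(
    np.arange(15, dtype=float).reshape(5, 3),
    index=list("ABCDE"),
    columns=list("abc"),
)

# Aggregate the whole frame: sum of column a, mean of column b.
totals = df.agg({"a": "sum", "b": "mean"})
print(totals)
```

With one function per column, the result is a Series indexed by column name rather than a one-row DataFrame.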



Source: https://stackoverflow.com/questions/11492215/how-to-do-a-groupby-on-an-empty-set-of-columns-in-pandas
