Use Pandas groupby() + apply() with arguments

妖精的绣舞 posted on 2019-12-18 11:44:55

Question


I would like to use df.groupby() in combination with apply() to apply a function to each row per group.

I normally use the following code, which usually works (note that this is without groupby()):

df.apply(myFunction, args=(arg1,))

With the groupby() I tried the following:

df.groupby('columnName').apply(myFunction, args=(arg1,))

However, I get the following error:

TypeError: myFunction() got an unexpected keyword argument 'args'

Hence, my question is: How can I use groupby() and apply() with a function that needs arguments?


Answer 1:


pandas.core.groupby.GroupBy.apply does NOT have a named parameter args, but pandas.DataFrame.apply does. GroupBy.apply instead forwards any extra positional and keyword arguments to the function.

So try this:

df.groupby('columnName').apply(lambda x: myFunction(x, arg1))

or, as suggested by @Zero, pass the extra argument positionally:

df.groupby('columnName').apply(myFunction, arg1)

Demo:

In [82]: df = pd.DataFrame(np.random.randint(5,size=(5,3)), columns=list('abc'))

In [83]: df
Out[83]:
   a  b  c
0  0  3  1
1  0  3  4
2  3  0  4
3  4  2  3
4  3  4  1

In [84]: def f(ser, n):
    ...:     return ser.max() * n
    ...:

In [85]: df.apply(f, args=(10,))
Out[85]:
a    40
b    40
c    40
dtype: int64

When using GroupBy.apply you can pass either named arguments:

In [86]: df.groupby('a').apply(f, n=10)
Out[86]:
    a   b   c
a
0   0  30  40
3  30  40  40
4  40  20  30

or positional arguments (note that (10) is just the integer 10, not a tuple):

In [87]: df.groupby('a').apply(f, 10)
Out[87]:
    a   b   c
a
0   0  30  40
3  30  40  40
4  40  20  30
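As a cross-check of the calling styles shown above, here is a minimal sketch on a fixed frame (the same values as the demo, so it does not depend on the random draw). Selecting columns b and c before apply and the functools.partial variant are my additions, not part of the original answer; the selection only keeps the grouping column out of the function's input.

```python
from functools import partial

import pandas as pd

# Fixed copy of the demo frame so the result is reproducible
df = pd.DataFrame({
    "a": [0, 0, 3, 4, 3],
    "b": [3, 3, 0, 2, 4],
    "c": [1, 4, 4, 3, 1],
})


def f(ser, n):
    # Receives one group (a DataFrame here); column-wise maximum scaled by n
    return ser.max() * n


# Assumption: restrict the groups to columns b and c
grouped = df.groupby("a")[["b", "c"]]

by_lambda = grouped.apply(lambda g: f(g, 10))  # argument bound inside a lambda
by_position = grouped.apply(f, 10)             # forwarded as a positional argument
by_keyword = grouped.apply(f, n=10)            # forwarded as a keyword argument
by_partial = grouped.apply(partial(f, n=10))   # argument bound with functools.partial

print(by_position)
```

All four calls should give the same frame, so which one to use is mostly a matter of readability.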



Answer 2:


Some confusion here over why using an args parameter throws an error might stem from the fact that pandas.DataFrame.apply does have an args parameter (a tuple), while pandas.core.groupby.GroupBy.apply does not.

So, when you call .apply on a DataFrame itself, you can use this argument; when you call .apply on a groupby object, you cannot.
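The difference between the two signatures can be checked on whatever pandas version is installed. The following is a small sketch using the standard inspect module; the frame and the variable names are only illustrative, and the exact parameter list printed will vary between pandas versions.

```python
import inspect

import pandas as pd

# Illustrative frame, only needed to obtain a groupby object
df = pd.DataFrame({"a": [0, 0, 3], "b": [1, 2, 3]})

frame_params = inspect.signature(pd.DataFrame.apply).parameters
group_params = inspect.signature(df.groupby("a").apply).parameters

VAR_POSITIONAL = inspect.Parameter.VAR_POSITIONAL

# DataFrame.apply: `args` is an ordinary named parameter
frame_has_named_args = (
    "args" in frame_params and frame_params["args"].kind is not VAR_POSITIONAL
)

# GroupBy.apply: no named `args`; extra arguments are collected variadically
group_has_named_args = (
    "args" in group_params and group_params["args"].kind is not VAR_POSITIONAL
)
group_takes_var_positional = any(
    p.kind is VAR_POSITIONAL for p in group_params.values()
)

print(list(frame_params))
print(list(group_params))
```

Because the groupby version collects extras variadically, a keyword such as args=(...) is not consumed by apply itself and ends up being handed to the function.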

In @MaxU's answer, the expression lambda x: myFunction(x, arg1) is passed to func (the first parameter); there is no need to specify additional *args/**kwargs because arg1 is specified in lambda.

An example:

import numpy as np
import pandas as pd

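# Example frame (illustrative values)
df = pd.DataFrame({'col1': [1, 1, 2], 'col2': [3, 4, 5]})

# Called on DataFrame - `axis` is DataFrame.apply's own parameter;
# extra positional arguments for the function would go in `args` (a tuple)
df.apply(np.sum, axis=0)  # equiv to df.sum(0)
df.apply(np.sum, axis=1)  # equiv to df.sum(1)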


# Called on groupby object of the DataFrame - will throw TypeError
print(df.groupby('col1').apply(np.sum, args=(0,)))
# TypeError: sum() got an unexpected keyword argument 'args'
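To see where the error message comes from, the sketch below passes args=(0,) to a groupby apply and records what the function actually receives. The helpers probe and strict and the small frame are hypothetical, introduced only for illustration; the column selection before apply is likewise my assumption to keep the grouping column out of the input.

```python
import pandas as pd

# Illustrative frame
df = pd.DataFrame({"col1": [1, 1, 2], "col2": [3, 4, 5]})

seen_kwargs = []


def probe(group, **kwargs):
    # Hypothetical helper: record whatever keyword arguments arrive
    seen_kwargs.append(kwargs)
    return len(group)


# `args=(0,)` is not interpreted by the groupby apply; it is forwarded as a keyword
sizes = df.groupby("col1")[["col2"]].apply(probe, args=(0,))
print(seen_kwargs[0])


def strict(group, n):
    # Hypothetical function with a fixed signature and no `args` parameter
    return group["col2"].max() * n


try:
    df.groupby("col1")[["col2"]].apply(strict, args=(10,))
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

The function receives a keyword literally named args, which is why a function without such a parameter fails with "unexpected keyword argument 'args'".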



Answer 3:


The following worked for me:

df2 = df.groupby('columnName').apply(lambda x: my_function(x, arg1, arg2))



Source: https://stackoverflow.com/questions/43483365/use-pandas-groupby-apply-with-arguments
