
Question:

What is the rule or process when a function is called with pandas apply() through a lambda versus without one? Examples are below. Without the lambda, the entire Series ( df[column name] ) is apparently passed to the "test" function, which throws an error when it tries to do a boolean operation on a Series.

If the same function is called via a lambda, it works. apply() iterates over the rows, passing each one as "x", and x[column name] returns a single value for that column in the current row.

It's as if the lambda is removing a dimension. Does anyone have an explanation, or can anyone point to the specific doc on this? Thanks.

Example 1 with lambda, works OK

print("probPredDF columns:", probPredDF.columns)

def test(x, y):
    if x == y:
        r = 'equal'
    else:
        r = 'not equal'
    return r

probPredDF.apply(lambda x: test(x['yTest'], x['yPred']), axis=1).head()

Example 1 output

probPredDF columns: Index([0, 1, 'yPred', 'yTest'], dtype='object')

Out[215]:
0    equal
1    equal
2    equal
3    equal
4    equal
dtype: object

Example 2 without lambda, throws boolean operation on series error

print("probPredDF columns:", probPredDF.columns)

def test(x, y):
    if x == y:
        r = 'equal'
    else:
        r = 'not equal'
    return r

probPredDF.apply(test(probPredDF['yTest'], probPredDF['yPred']), axis=1).head()

Example 2 output

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Answer 1:

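There is nothing magic about a lambda. A lambda is just a function that is defined inline and has no name; the one in Example 1 takes a single parameter. apply() expects a callable and, with axis=1, calls it once per row, passing that row as a Series. In Example 2, test( probPredDF['yTest'], probPredDF['yPred'] ) is evaluated before apply() runs, so test receives two whole columns and the comparison in its if statement fails. You can use a named function wherever a lambda is expected, but that function must also take one parameter. You need to do something like this: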

Define it as:

def wrapper(x):
    return test(x['yTest'], x['yPred'])

Use it as:

probPredDF.apply(wrapper, axis=1)

Source: pandas apply() with and without lambda

Tags

pandas

lambda

test

apply