pandas apply() with and without lambda

Question

What is the rule when a function is called through pandas apply() with a lambda versus without one? Examples are below. Without the lambda, the entire Series ( df[column name] ) is apparently passed to the "test" function, which raises an error when it tries to do a boolean operation on a Series.

If the same function is called via a lambda, it works: apply() iterates over the rows, passes each row in as "x", and x[ column name ] returns a single value for that column in the current row.

It's as if the lambda were removing a dimension. Does anyone have an explanation, or a pointer to the specific doc on this? Thanks.

Example 1 with lambda, works OK

print("probPredDF columns:", probPredDF.columns)

def test(x, y):
    if x == y:
        r = 'equal'
    else:
        r = 'not equal'
    return r

probPredDF.apply(lambda x: test(x['yTest'], x['yPred']), axis=1).head()

Example 1 output

probPredDF columns: Index([0, 1, 'yPred', 'yTest'], dtype='object')

Out[215]:
0    equal
1    equal
2    equal
3    equal
4    equal
dtype: object

Example 2 without lambda, throws boolean operation on series error

print("probPredDF columns:", probPredDF.columns)

def test(x, y):
    if x == y:
        r = 'equal'
    else:
        r = 'not equal'
    return r

probPredDF.apply(test(probPredDF['yTest'], probPredDF['yPred']), axis=1).head()

Example 2 output

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

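Answer 1:

There is nothing magic about a lambda. It is just an anonymous function defined inline. apply() expects a callable and, with axis=1, calls it once per row, passing that row as the single argument. In Example 2 the expression test(probPredDF['yTest'], probPredDF['yPred']) is evaluated before apply() ever runs, so test receives two whole Series and the if x==y check raises the ValueError. You can use a named function wherever a lambda is expected, but that function must also take the row as its one parameter. You need to do something like...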

Define it as:

def wrapper(x):
    return test(x['yTest'], x['yPred'])

Use it as:

probPredDF.apply(wrapper, axis=1)

Source: https://stackoverflow.com/questions/43810094/pandas-apply-with-and-without-lambda

Tags

pandas

lambda

apply