Question
I'm learning Pandas and Numpy, currently going through this section of the tutorial. I'm new to Python altogether, so this is probably a basic beginner's question.
Given this data frame:
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(4, 3), columns=['A', 'B', 'C'],
                  index=pd.date_range('1/1/2000', periods=4))
df.iloc[3:7] = np.nan
I can't explain the difference between the following results of df.agg:
Call 1:
df.agg(sum)
#Result:
A NaN
B NaN
C NaN
dtype: float64
Call 2:
df.agg('sum')
#Result:
A -1.776752
B -2.070156
C -0.124162
dtype: float64
The result of df.agg('sum') is the same as that of df.agg(np.sum) or df.sum(). I expected df.agg('sum') to produce the same result as df.agg(sum).
Does Pandas have special logic to resolve these functions such that it would prefer np.sum (or run df.sum) instead of the built-in sum?
Answer 1:
In the documentation you linked to, it says:
You can also pass named methods as strings.
So strings are resolved as method names on the DataFrame (or Series, if you call agg on a Series).
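To illustrate the point, here is a minimal sketch comparing the string form with an explicit method lookup, and with Python's built-in sum applied directly to a column. The small fixed frame and the variable names (by_name, by_method, builtin_total) are made up for this example. How agg dispatches a built-in callable may depend on the pandas version, so the sketch calls the built-in directly instead of passing it to agg.

```python
import math

import numpy as np
import pandas as pd

# Small fixed frame (illustrative data) whose last row is all NaN.
df = pd.DataFrame(
    [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [np.nan, np.nan, np.nan]],
    columns=["A", "B", "C"],
)

# The string 'sum' is looked up as a method name, so this behaves like df.sum(),
# which skips NaN values by default (skipna=True).
by_name = df.agg("sum")
by_method = getattr(df, "sum")()  # same as df.sum()
print(by_name.equals(by_method))

# Python's built-in sum has no NaN handling: it adds the values one by one,
# so a single NaN in the column makes the whole total NaN.
builtin_total = sum(df["A"])
print(math.isnan(builtin_total))
```

The NaN results in Call 1 are consistent with the built-in sum being applied to each column as-is, while Call 2 goes through DataFrame.sum and its NaN-skipping default.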
Source: https://stackoverflow.com/questions/53436538/how-does-pandas-resolve-the-function-specified-by-name-in-df-agg