Question
I find myself often having to check whether a column or row exists in a dataframe before trying to reference it. For example I end up adding a lot of code like:
if 'mycol' in df.columns and 'myindex' in df.index:
    x = df.loc['myindex', 'mycol']
else:
    x = mydefault
Is there any way to do this more nicely? For example on an arbitrary object I can do x = getattr(anobject, 'id', default) - is there anything similar to this in pandas? Really any way to achieve what I'm doing more gracefully?
Answer 1:
Series has a get method that returns a default value when the label is missing.
So you could do:
df.mycol.get('myindex', np.nan)
Example:
In [117]:
import numpy as np
import pandas as pd
df = pd.DataFrame({'mycol': np.arange(5), 'dummy': np.arange(5)})
df
Out[117]:
dummy mycol
0 0 0
1 1 1
2 2 2
3 3 3
4 4 4
[5 rows x 2 columns]
In [118]:
print(df.mycol.get(2, np.nan))  # label 2 exists, so its value is returned
print(df.mycol.get(5, np.nan))  # label 5 is missing, so the default is returned
2
nan
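Note that Series.get only guards against a missing row label; df.mycol itself still fails if the column does not exist. A minimal sketch of covering both cases by chaining DataFrame.get and Series.get follows. The helper name get_cell and the label 'nocol' are hypothetical, chosen only for illustration, and the sketch assumes unique column labels.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'mycol': np.arange(5), 'dummy': np.arange(5)})

def get_cell(frame, row, col, default=np.nan):
    """Hypothetical helper: look up frame[col][row], falling back to default."""
    # DataFrame.get returns None (its own default) when the column is missing
    column = frame.get(col)
    if column is None:
        return default
    # Series.get returns the supplied default when the row label is missing
    return column.get(row, default)

present = get_cell(df, 2, 'mycol')                   # row and column both exist
missing_row = get_cell(df, 5, 'mycol', default=-1)   # no row labelled 5
missing_col = get_cell(df, 2, 'nocol', default=-1)   # no column 'nocol'
print(present, missing_row, missing_col)
```

This keeps the lookup free of exceptions, at the cost of two separate lookups.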
Answer 2:
Python favors asking for forgiveness over asking for permission, and you'll find a lot of posts on this matter.
In Python, catching exceptions is relatively inexpensive, so you're encouraged to rely on them. This is called the EAFP approach ("easier to ask for forgiveness than permission").
For example:
try:
    x = df.loc['myindex', 'mycol']
except KeyError:
    x = mydefault
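If this pattern comes up often, the try/except can be wrapped once in a small function, which gives something close to the getattr(obj, name, default) call mentioned in the question. This is only a sketch: the name loc_or_default and the sample frame are assumptions made for illustration, not part of pandas.

```python
import pandas as pd

def loc_or_default(frame, row, col, default=None):
    """Hypothetical helper: frame.loc[row, col], or default if either label is missing."""
    try:
        return frame.loc[row, col]
    except KeyError:
        # .loc raises KeyError for a missing row label or a missing column label
        return default

# Small illustrative frame with string row labels
df = pd.DataFrame({'mycol': [10, 20]}, index=['myindex', 'other'])

found = loc_or_default(df, 'myindex', 'mycol', default=0)
no_row = loc_or_default(df, 'nope', 'mycol', default=0)
no_col = loc_or_default(df, 'myindex', 'nocol', default=0)
print(found, no_row, no_col)
```

Catching only KeyError keeps unrelated errors visible instead of silently turning them into the default.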
Source: https://stackoverflow.com/questions/23403352/return-default-if-pandas-dataframe-loc-location-doesnt-exist