return default if pandas dataframe.loc location doesn't exist

与世无争的帅哥 提交于 2021-02-06 14:20:55

问题


I find myself often having to check whether a column or row exists in a dataframe before trying to reference it. For example I end up adding a lot of code like:

if 'mycol' in df.columns and 'myindex' in df.index: x = df.loc[myindex, mycol]
else: x = mydefault

Is there any way to do this more nicely? For example on an arbitrary object I can do x = getattr(anobject, 'id', default) - is there anything similar to this in pandas? Really any way to achieve what I'm doing more gracefully?


回答1:


There is a method for Series:

So you could do:

df.mycol.get(myIndex, NaN)

Example:

In [117]:

df = pd.DataFrame({'mycol':arange(5), 'dummy':arange(5)})
df
Out[117]:
   dummy  mycol
0      0      0
1      1      1
2      2      2
3      3      3
4      4      4

[5 rows x 2 columns]
In [118]:

print(df.mycol.get(2, NaN))
print(df.mycol.get(5, NaN))
2
nan



回答2:


Python has this mentality to ask for forgiveness instead of permission. You'll find a lot of posts on this matter, such as this one.

In Python catching exceptions is relatively inexpensive, so you're encouraged to use it. This is called the EAFP approach.

For example:

try:
    x = df.loc['myindex', 'mycol']
except KeyError:
    x = mydefault


来源:https://stackoverflow.com/questions/23403352/return-default-if-pandas-dataframe-loc-location-doesnt-exist

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!