Python slicing does not raise a KeyError even when the column is missing

空扰寡人 submitted on 2019-12-24 11:56:11


I have a pandas dataframe with 10 keys. If I try to access a column that is not present, it still returns a column of NaN. I was expecting a KeyError. Why does pandas not detect the missing column?

In the example below, vendor_id is a valid column in the dataframe. The other column is absent from the dataset.

final_feature.ix[:,['vendor_id','this column is absent']]
  vendor_id  this column is absent
0    434236                    NaN

Out[1016]: pandas.core.frame.DataFrame

EDIT 1: I verified that there are no null values:

print (final_feature1.isnull().values.any())


This is expected behaviour and is due to the "setting with enlargement" feature.

In [15]:
df = pd.DataFrame(np.random.randn(5,3), columns=list('abc'))
df.ix[:,['a','d']]

          a   d
0 -1.164349 NaN
1  0.400116 NaN
2 -0.599496 NaN
3  0.186837 NaN
4  0.385656 NaN

If you try df['d'] or df[['a','d']] then you will get a KeyError.
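
As a quick check of that claim, here is a minimal sketch on a throwaway frame. The helper name raises_key_error is only illustrative and is not part of pandas.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(5, 3), columns=list('abc'))

def raises_key_error(selector):
    """Illustrative helper: True if df[selector] raises KeyError."""
    try:
        df[selector]
    except KeyError:
        return True
    return False

single_missing = raises_key_error('d')            # label does not exist
list_with_missing = raises_key_error(['a', 'd'])  # one valid, one missing
all_present = raises_key_error(['a', 'b'])        # every label exists

print(single_missing, list_with_missing, all_present)  # prints: True True False
```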

Effectively, what you're doing is reindexing. The fact that the column doesn't exist doesn't matter when using ix; you'll just get a column of NaNs.
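
The reindexing can also be written out explicitly. The sketch below uses DataFrame.reindex rather than ix, so it does not depend on the indexer being available in your pandas version; the frame is made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(5, 3), columns=list('abc'))

# reindex does not raise for unknown labels; it creates them filled with NaN.
reindexed = df.reindex(columns=['a', 'd'])
print(reindexed)
```

Column a keeps its values and d comes back as all NaN, which matches the output shown above.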

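The same behaviour is observed using loc (in pandas versions of that era; since pandas 1.0, loc raises a KeyError when any label in the list is missing):

In [24]:
df.loc[:,['a','d']]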

          a   d
0 -1.164349 NaN
1  0.400116 NaN
2 -0.599496 NaN
3  0.186837 NaN
4  0.385656 NaN

When you don't use ix or loc and instead do df['d'], you are indexing a specific column or list of columns. There is no expectation of enlargement here unless you are assigning to a new column, e.g. df['d'] = some_new_vals.
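
To illustrate the assignment case, a small sketch with made-up data; the column names d and e are arbitrary.

```python
import pandas as pd

df = pd.DataFrame({'a': [1.0, 2.0, 3.0]})

# Assigning to a label that does not exist enlarges the frame.
df['d'] = [10, 20, 30]

# Assignment through loc enlarges the frame as well.
df.loc[:, 'e'] = 0

print(list(df.columns))  # prints: ['a', 'd', 'e']
```

Reading a missing label and assigning to one are treated differently: only assignment is meant to create new columns.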

To guard against this you can validate your list using isin with the columns:

In [26]:
valid_cols = df.columns.isin(['a','d'])
df.ix[:, valid_cols]

          a
0 -1.164349
1  0.400116
2 -0.599496
3  0.186837
4  0.385656

Now you will only see columns that exist, which also guards against any mis-spelt column names.
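
If you would rather be told about missing or mis-spelt names than have them dropped silently, one possible variation is sketched below. select_columns and its strict flag are hypothetical, not a pandas API.

```python
import pandas as pd

def select_columns(frame, wanted, strict=True):
    """Hypothetical helper: return the `wanted` columns of `frame`.

    With strict=True a KeyError lists every missing name;
    with strict=False missing names are skipped.
    """
    missing = [c for c in wanted if c not in frame.columns]
    if missing and strict:
        raise KeyError("columns not found: {}".format(missing))
    present = [c for c in wanted if c in frame.columns]
    return frame[present]

df = pd.DataFrame({'a': [1, 2], 'b': [3, 4], 'c': [5, 6]})

lenient = select_columns(df, ['a', 'd'], strict=False)
print(list(lenient.columns))  # prints: ['a']
```

With the default strict=True, the same call raises a KeyError that names d.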


For me, selecting by a subset works as expected and raises the error:

final_feature[['vendor_id','this column is absent']]

KeyError: "['this column is absent'] not in index"

Also, ix is deprecated in the latest version of pandas (0.20.1); see the release notes for details.

