I would like to fill missing values in a pandas DataFrame with the average of the cells directly before and after the missing value. So if it was [1, NaN, 3], the NaN value would be replaced by 2.
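This will work as long as no NaN sits in the first or last row, which is implied to be true by your imputation methodology. It also assumes the default integer index and no two consecutive NaN values in a column (an adjacent pair would stay NaN, since each average would include the other NaN).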
>>> import numpy as np
>>> import pandas as pd
>>> data = pd.DataFrame({'a': [10, 6, -3, -2, 4, 12, 3, 3],
...                      'b': [6, -3, np.nan, 12, 8, 11, -5, -5],
...                      'id': [1, 1, 1, 1, np.nan, 2, 2, 4]})
>>> data
    a     b   id
0  10   6.0  1.0
1   6  -3.0  1.0
2  -3   NaN  1.0
3  -2  12.0  1.0
4   4   8.0  NaN
5  12  11.0  2.0
6   3  -5.0  2.0
7   3  -5.0  4.0
>>> nan_cols = data.columns[data.isnull().any(axis=0)]
>>> for col in nan_cols:
...     for i in range(len(data)):
...         if pd.isnull(data.loc[i, col]):
...             data.loc[i, col] = (data.loc[i-1, col] + data.loc[i+1, col]) / 2
>>> data
    a     b   id
0  10   6.0  1.0
1   6  -3.0  1.0
2  -3   4.5  1.0
3  -2  12.0  1.0
4   4   8.0  1.5
5  12  11.0  2.0
6   3  -5.0  2.0
7   3  -5.0  4.0
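
If the explicit loop gets slow on a large frame, a vectorized variant may be worth trying. The sketch below is not the loop above rewritten line for line: it swaps in `ffill` and `bfill`, averaging the last valid value before each gap with the first valid value after it. For an isolated NaN that is the same as averaging its two neighbours. The names `neighbour_mean` and `filled` are only illustrative.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({'a': [10, 6, -3, -2, 4, 12, 3, 3],
                     'b': [6, -3, np.nan, 12, 8, 11, -5, -5],
                     'id': [1, 1, 1, 1, np.nan, 2, 2, 4]})

# ffill carries the previous valid value forward, bfill carries the next
# valid value backward; their mean is the candidate fill for each cell.
neighbour_mean = (data.ffill() + data.bfill()) / 2

# fillna only touches cells that are NaN, so existing values are kept.
filled = data.fillna(neighbour_mean)
print(filled)
```

Two differences from the loop are worth knowing. A NaN in the first or last row is simply left as NaN instead of raising a `KeyError`. A run of consecutive NaNs is filled with the mean of the nearest valid values on either side, whereas the loop would leave such a run as NaN.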