I have a dataframe of shape (40, 500). Each row in the dataframe has numerical values up to some variable column number k, and all the entries after that are NaN.
Here's a NumPy-based solution that picks the last non-NaN value in each row:
In [113]: a
Out[113]:
array([[ 17.,  53.,  nan,  63.,  66.,  nan,  nan,  nan,  nan,  nan],
       [ 54.,  96.,  71.,  20.,  70.,  58.,  91.,  nan,  nan,  nan],
       [ 58.,  26.,  72.,  93.,  58.,  29.,  44.,  28.,  36.,  88.],
       [ nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan],
       [ 94.,  23.,  nan,  nan,  92.,  81.,  40.,  30.,  84.,  nan]])
In [114]: m = ~np.isnan(a)
In [115]: a[np.arange(m.shape[0]), m.shape[1]-m[:,::-1].argmax(1)-1]
Out[115]: array([ 66., 91., 88., nan, 84.])
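For readers who want to see why that index works, here is a minimal sketch that unpacks the one-liner into named steps on a smaller, made-up array. The names `rev_first`, `last_col` and `last_vals` are illustrative only and do not appear in the session above.

```python
import numpy as np

# Small made-up array: row 1 is all NaN, row 2 has a NaN in the middle.
a = np.array([[1.0, 2.0, np.nan, np.nan],
              [np.nan, np.nan, np.nan, np.nan],
              [5.0, np.nan, 7.0, 8.0]])

m = ~np.isnan(a)                         # True where a value is present
rev_first = m[:, ::-1].argmax(1)         # first True per row, counting from the right
last_col = m.shape[1] - rev_first - 1    # same position as a left-to-right column index
last_vals = a[np.arange(m.shape[0]), last_col]  # pick one element per row
```

Note that for an all-NaN row `argmax` finds no `True` and returns 0, so `last_col` points at the final column and the result for that row is simply NaN, which matches the fourth entry of `Out[115]`.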
To port this to a dataframe, first extract the values as an array and rebuild the mask, then wrap the result in an output dataframe:
a = df.values
m = ~np.isnan(a)
vals = a[np.arange(m.shape[0]), m.shape[1]-m[:,::-1].argmax(1)-1]
df_out = pd.DataFrame(vals,index=df.index)
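Putting the dataframe port together, a self-contained sketch might look like the following. The function name `last_valid_per_row` and the sample frame are hypothetical, and the sketch assumes every column is numeric so the values can be cast to float.

```python
import numpy as np
import pandas as pd

def last_valid_per_row(df):
    """Return the last non-NaN value of each row as a one-column DataFrame."""
    a = df.to_numpy(dtype=float)   # assumption: all columns are numeric
    m = ~np.isnan(a)               # True where a value is present
    # Column index of the last True in each row (final column if the row is all NaN).
    last_col = m.shape[1] - m[:, ::-1].argmax(1) - 1
    vals = a[np.arange(m.shape[0]), last_col]
    return pd.DataFrame(vals, index=df.index)

# Made-up sample data with a custom index to show that the index is preserved.
df = pd.DataFrame([[17.0, 53.0, np.nan],
                   [np.nan, np.nan, np.nan],
                   [1.0, 2.0, 3.0]],
                  index=list("abc"))
df_out = last_valid_per_row(df)
```

Rows that contain no valid values come back as NaN, so no special casing is needed for them.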