Getting the last non-NA value across rows in a pandas DataFrame

北海茫月 2020-12-11 05:22

I have a DataFrame of shape (40, 500). Each row holds numerical values up to some column k, which varies from row to row, and every entry after that is NaN. How can I get the last non-NaN value in each row?

3 answers

隐瞒了意图╮ (original poster)

2020-12-11 05:50

You need last_valid_index wrapped in a custom function, because if all values in a row are NaN it returns None, and indexing the row with None raises a KeyError:

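import numpy as np

def f(x):
    # last_valid_index() returns None when the whole row is NaN
    idx = x.last_valid_index()
    if idx is None:
        return np.nan
    return x.loc[idx]

# Apply row-wise (axis=1) to get the last non-NaN value of each row
df['status'] = df.apply(f, axis=1)
print(df)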
                1      2      3      4      5      6      7      8      9  \
0                                                                           
2016-06-02  7.080  7.079  7.079  7.079  7.079  7.079    NaN    NaN    NaN   
2016-06-08  7.053  7.053  7.053  7.053  7.053  7.054    NaN    NaN    NaN   
2016-06-09  7.061  7.061  7.060  7.060  7.060  7.060    NaN    NaN    NaN   
2016-06-14    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN   
2016-06-15  7.066  7.066  7.066  7.066    NaN    NaN    NaN    NaN    NaN   
2016-06-16  7.067  7.067  7.067  7.067  7.067  7.067  7.068  7.068    NaN   
2016-06-21  7.053  7.053  7.052    NaN    NaN    NaN    NaN    NaN    NaN   
2016-06-22  7.049  7.049    NaN    NaN    NaN    NaN    NaN    NaN    NaN   
2016-06-28  7.058  7.058  7.059  7.059  7.059  7.059  7.059  7.059  7.059   

            status  
0                   
2016-06-02   7.079  
2016-06-08   7.054  
2016-06-09   7.060  
2016-06-14     NaN  
2016-06-15   7.066  
2016-06-16   7.068  
2016-06-21   7.052  
2016-06-22   7.049  
2016-06-28   7.059
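
To try the approach above without the original data, here is a minimal self-contained sketch. The small frame and the helper name last_valid are made up for illustration and are not part of the question's data; only the use of last_valid_index follows the answer.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: each row has values up to some column, then NaN.
df = pd.DataFrame(
    {
        1: [7.080, np.nan, 7.066],
        2: [7.079, np.nan, 7.066],
        3: [7.079, np.nan, np.nan],
    },
    index=["2016-06-02", "2016-06-14", "2016-06-15"],
)

def last_valid(row):
    """Return the last non-NaN value in a row, or NaN if the row is all NaN."""
    idx = row.last_valid_index()  # None when every value in the row is NaN
    return np.nan if idx is None else row.loc[idx]

# axis=1 passes each row to the function as a Series
status = df.apply(last_valid, axis=1)
print(status)
```

The all-NaN row ("2016-06-14" here) is the case the None check exists for; without it, the lookup would fail on that row.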

Alternative solution: forward-fill along each row with ffill(axis=1), then select the last column with iloc:

df['status'] = df.ffill(axis=1).iloc[:, -1]
print(df)
            status  
0                   
2016-06-02   7.079  
2016-06-08   7.054  
2016-06-09   7.060  
2016-06-14     NaN  
2016-06-15   7.066  
2016-06-16   7.068  
2016-06-21   7.052  
2016-06-22   7.049  
2016-06-28   7.059
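
The forward-fill variant can be checked the same way. The sketch below uses a small made-up frame (not the data from the question) and assumes, as the question states, that NaN values only appear after the last real value in a row.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: each row has values up to some column, then NaN.
df = pd.DataFrame(
    {
        1: [7.080, np.nan, 7.066],
        2: [7.079, np.nan, 7.066],
        3: [7.079, np.nan, np.nan],
    },
    index=["2016-06-02", "2016-06-14", "2016-06-15"],
)

# Forward-fill left to right within each row, so the last column ends up
# holding the last non-NaN value of that row (or NaN if the row is all NaN).
filled = df.ffill(axis=1)
status_ffill = filled.iloc[:, -1]
print(status_ffill)
```

This avoids a Python-level function call per row, at the cost of building a filled copy of the whole frame.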
