Pandas Use Value if Not Null, Else Use Value From Next Column

猫巷女王i 2020-12-25 12:23

Given the following dataframe:

import numpy as np
import pandas as pd
df = pd.DataFrame({'COL1': ['A', np.nan, 'A'],
                   'COL2': [np.nan, 'A', 'A']})

4 answers
  •  夕颜 (original poster)
    2020-12-25 13:04

    If we modify your df slightly, you will see that this works. In fact, it works for any number of columns, so long as each row has at least one valid value:

    In [5]:
    df = pd.DataFrame({'COL1': ['B', np.nan,'B'], 
                       'COL2' : [np.nan,'A','A']})
    df
    
    Out[5]:
      COL1 COL2
    0    B  NaN
    1  NaN    A
    2    B    A
    
    In [6]:    
    df.apply(lambda x: x[x.first_valid_index()], axis=1)
    
    Out[6]:
    0    B
    1    A
    2    B
    dtype: object
    

    first_valid_index returns the index label (in this case, the column name) of the first non-NaN value:

    In [7]:
    df.apply(lambda x: x.first_valid_index(), axis=1)
    
    Out[7]:
    0    COL1
    1    COL2
    2    COL1
    dtype: object
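
To make the per-row mechanics concrete, here is a minimal sketch on a single row, built by hand the way apply(axis=1) would pass it to the lambda. The names row, empty_row, label, value and empty_label are illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd

# One row of the frame above, as apply(axis=1) would hand it to the lambda:
# a Series whose index holds the column names.
row = pd.Series([np.nan, "A"], index=["COL1", "COL2"])

label = row.first_valid_index()  # label of the first non-NaN entry
value = row[label]               # index into the row with that label

# A row with no valid entries has no such label, so None comes back.
empty_row = pd.Series([np.nan, np.nan], index=["COL1", "COL2"])
empty_label = empty_row.first_valid_index()

print(label, value, empty_label)  # COL2 A None
```

The None result is why the approach needs at least one valid value per row: there is no label to index with otherwise.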
    

    So we can use the label returned by first_valid_index to index into each row, which apply passes in as a Series.
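
As a possible alternative that is not part of the answer above, the same result can usually be reached without a row-wise apply. The sketch below assumes the answer's frame plus one extra all-NaN row added for illustration; by_bfill and by_fillna are hypothetical names. It back-fills along each row and takes the first column, and also shows the two-column shortcut with fillna.

```python
import numpy as np
import pandas as pd

# The answer's frame, plus a final all-NaN row (added here for illustration).
df = pd.DataFrame({"COL1": ["B", np.nan, "B", np.nan],
                   "COL2": [np.nan, "A", "A", np.nan]})

# Back-fill along each row (axis=1), so the first column ends up holding
# the first non-NaN value of that row; works for any number of columns.
by_bfill = df.bfill(axis=1).iloc[:, 0]

# With exactly two columns, filling COL1's gaps from COL2 is equivalent.
by_fillna = df["COL1"].fillna(df["COL2"])

print(by_bfill.tolist())
```

Both variants leave NaN in the all-NaN row, whereas the apply version has no label to look up there because first_valid_index returns None.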
