Pandas Use Value if Not Null, Else Use Value From Next Column

Given the following dataframe:

import numpy as np
import pandas as pd

df = pd.DataFrame({'COL1': ['A', np.nan, 'A'],
                   'COL2': [np.nan, 'A', 'A']})


                      
4 Answers

北恋, 2020-12-25 12:50

In [8]: df
Out[8]:
  COL1 COL2
0    A  NaN
1  NaN    B
2    A    B

In [9]: df["COL3"] = df["COL1"].fillna(df["COL2"])

In [10]: df
Out[10]:
  COL1 COL2 COL3
0    A  NaN    A
1  NaN    B    B
2    A    B    A
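
The session above does not show how its frame was built. A minimal self-contained sketch, assuming the A/B values printed in Out[8], might look like this:

```python
import numpy as np
import pandas as pd

# Values copied from the Out[8] display above; the construction itself is an assumption.
df = pd.DataFrame({"COL1": ["A", np.nan, "A"],
                   "COL2": [np.nan, "B", "B"]})

# Keep COL1 where it is non-null; otherwise fall back to COL2 on the same row.
df["COL3"] = df["COL1"].fillna(df["COL2"])

print(df)
```

Passing a Series to fillna aligns it on the index, so each missing COL1 entry is filled from the COL2 value on the same row.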

                                                                        
                                                        
            
            
              
                
野的像风, 2020-12-25 12:58

You can use np.where to conditionally set column values.

df = df.assign(COL3=np.where(df.COL1.isnull(), df.COL2, df.COL1))

>>> df
  COL1 COL2 COL3
0    A  NaN    A
1  NaN    A    A
2    A    A    A


If you don't mind mutating the values in COL2, you can update them directly to get your desired result.

df = pd.DataFrame({'COL1': ['A', np.nan,'A'], 
                   'COL2' : [np.nan,'B','B']})

>>> df
  COL1 COL2
0    A  NaN
1  NaN    B
2    A    B

df.COL2.update(df.COL1)

>>> df
  COL1 COL2
0    A    A
1  NaN    B
2    A    A
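
The in-place update relies on df.COL2 handing back data that is still tied to the frame. Depending on the pandas version and its Copy-on-Write settings, that change may not propagate back to df. A hedged alternative sketch that gets the same COL2 values by plain assignment:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"COL1": ["A", np.nan, "A"],
                   "COL2": [np.nan, "B", "B"]})

# COL1 wins wherever it is non-null; COL2 keeps its own value elsewhere.
# Assigning the result back avoids depending on in-place mutation of df.COL2.
df["COL2"] = df["COL1"].fillna(df["COL2"])

print(df)
```

This is the same precedence rule as update (non-null COL1 values overwrite COL2), written so the result is stored explicitly.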

                                                                        
                                                        
            
            
              
                
夕颜, 2020-12-25 13:04

If we modify your df slightly, you will see that this works, and in fact it works for any number of columns as long as each row contains at least one valid (non-NaN) value:

In [5]:
df = pd.DataFrame({'COL1': ['B', np.nan,'B'], 
                   'COL2' : [np.nan,'A','A']})
df

Out[5]:
  COL1 COL2
0    B  NaN
1  NaN    A
2    B    A

In [6]:    
df.apply(lambda x: x[x.first_valid_index()], axis=1)

Out[6]:
0    B
1    A
2    B
dtype: object


first_valid_index will return the index value (in this case column) that contains the first non-NaN value:

In [7]:
df.apply(lambda x: x.first_valid_index(), axis=1)

Out[7]:
0    COL1
1    COL2
2    COL1
dtype: object


So we can use that label to index into each row, which apply passes to the lambda as a Series.
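
Note that first_valid_index returns None for a row with no valid value, so the indexing step would fail on such a row. As a possible alternative (not the approach used in this answer), back-filling along each row and taking the first column picks the same values and leaves NaN for all-NaN rows. In this sketch the third column and the last row are hypothetical additions for illustration:

```python
import numpy as np
import pandas as pd

# COL3 and the all-NaN last row are hypothetical, added to show the general case.
df = pd.DataFrame({"COL1": ["B", np.nan, "B", np.nan],
                   "COL2": [np.nan, "A", "A", np.nan],
                   "COL3": ["C", "C", np.nan, np.nan]})

# Back-fill along each row (axis=1) so the leftmost column ends up holding
# the first non-null value of that row, then select that leftmost column.
first_valid = df.bfill(axis=1).iloc[:, 0]

print(first_valid)
```

Because it avoids a Python-level lambda per row, this form may also be faster on large frames, though that depends on the data.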
                                                                        
                                                        
            
            
              
                
醉话见心, 2020-12-25 13:16

Using .combine_first, which gives precedence to non-null values in the Series or DataFrame calling it:

import pandas as pd
import numpy as np

df = pd.DataFrame({'COL1': ['A', np.nan,'A'], 
                   'COL2' : [np.nan,'B','B']})

df['COL3'] = df.COL1.combine_first(df.COL2)


Output:

  COL1 COL2 COL3
0    A  NaN    A
1  NaN    B    B
2    A    B    A
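
One difference from fillna that may matter when the two Series do not share an index: combine_first returns the union of both indexes, while fillna keeps only the caller's index. A small sketch with two hypothetical Series, s1 and s2, that are not part of the original example:

```python
import numpy as np
import pandas as pd

# Hypothetical Series with partially overlapping indexes.
s1 = pd.Series(["A", np.nan], index=[0, 1])
s2 = pd.Series(["B", "C"], index=[1, 2])

# combine_first: s1's non-null values win, gaps come from s2, index is the union.
combined = s1.combine_first(s2)

# fillna: gaps in s1 are filled from s2, but only s1's index labels are kept.
filled = s1.fillna(s2)

print(combined)
print(filled)
```

For two columns of the same DataFrame, as in this question, the indexes are identical, so both methods give the same COL3.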

                                                                        
                                                        
            
            
              
                