Imputation of missing values for categories in pandas


时光说笑 2020-12-04 18:03

How can I fill NaNs in a categorical column of a pandas DataFrame with the most frequent level?

In R's randomForest package there is the na.roughfix option: A


      
      
        
4 answers

孤城傲影 (original poster) 2020-12-04 18:25
              

            
            
                        
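You can use df = df.fillna(df['Label'].value_counts().index[0]) to fill NaNs with the most frequent value from one column. Note that this fills the NaNs in every column of df with that single value, not only those in 'Label'.

If you want to fill every column with its own most frequent value, you can use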
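To make the one-column case concrete, here is a minimal sketch. The data and the 'Other' column are made up for illustration; only 'Label' comes from the snippet above. It contrasts filling just that column with filling the whole frame using the same value.

```python
import numpy as np
import pandas as pd

# Hypothetical example data; 'Label' is the column name from the answer,
# 'Other' is an assumed second column added for illustration.
df = pd.DataFrame({
    "Label": ["a", "b", "a", np.nan],
    "Other": ["x", np.nan, "y", "y"],
})

# value_counts() skips NaN and sorts by frequency (descending),
# so the first index entry is the most frequent value.
most_frequent = df["Label"].value_counts().index[0]

# Fill only the 'Label' column.
single = df.copy()
single["Label"] = single["Label"].fillna(most_frequent)

# Fill every NaN in the frame with the most frequent 'Label' value.
whole = df.fillna(most_frequent)

print(single)
print(whole)
```

The first variant is probably what you want when the columns hold unrelated categories.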

df = df.apply(lambda x:x.fillna(x.value_counts().index[0]))
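One caveat with the lambda: for a column that is entirely NaN, value_counts() is empty and index[0] raises an IndexError. A small sketch of a guarded version follows. The helper name fill_with_most_frequent and the sample data are my own, not part of pandas.

```python
import numpy as np
import pandas as pd

def fill_with_most_frequent(col: pd.Series) -> pd.Series:
    """Fill NaNs in one column with its most frequent value (hypothetical helper)."""
    counts = col.value_counts()  # NaN is excluded by default
    if counts.empty:
        # Nothing to impute from (e.g. an all-NaN column): leave it unchanged.
        return col
    return col.fillna(counts.index[0])

# Made-up example data, including a categorical column and an all-NaN column.
df = pd.DataFrame({
    "color": pd.Categorical(["red", "blue", "red", np.nan]),
    "size": ["S", np.nan, "M", "M"],
    "empty": [np.nan, np.nan, np.nan, np.nan],
})

# apply() calls the helper once per column.
filled = df.apply(fill_with_most_frequent)
print(filled)
```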

UPDATE 2018-25-10 ⬇

Starting from version 0.13.1, pandas includes a mode method for Series and DataFrames.
You can use it to fill the missing values in each column with that column's own most frequent value, like this:

df = df.fillna(df.mode().iloc[0])
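A short sketch of how the mode-based fill behaves, on made-up data. df.mode() returns one row per mode, with ties sorted, so iloc[0] picks the first of the tied values. fillna then matches the resulting Series to the columns by label.

```python
import numpy as np
import pandas as pd

# Made-up example data; 'grade' has a tie between "a" and "b".
df = pd.DataFrame({
    "color": ["red", "blue", "red", np.nan],
    "grade": ["b", "a", np.nan, np.nan],
})

# First row of df.mode(): one most frequent value per column.
modes = df.mode().iloc[0]

# Each column is filled with its own entry from `modes`.
filled = df.fillna(modes)

print(modes)
print(filled)
```

If ties should be broken differently (na.roughfix in R breaks them at random, as far as I recall), you would have to pick the value yourself instead of relying on iloc[0].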

    
             
                                                        
            
            
              
                