UndefinedMetricWarning: F-score is ill-defined and being set to 0.0 in labels with no predicted samples

野的像风 2020-11-28 04:20

I'm getting this weird error:

classification.py:1113: UndefinedMetricWarning: F-score is ill-defined and being set to 0.0 in labels with no predicted sample         


        
6 answers
  • 2020-11-28 04:34

    Building on @Shovalt's answer, in short:

    Alternatively, you could use the following lines of code:

        import numpy as np
        from sklearn.metrics import f1_score

        # Restrict scoring to the labels that actually appear in y_pred
        f1_score(y_test, y_pred, average='weighted', labels=np.unique(y_pred))
    

    This should remove the warning and give you the result you wanted, because by restricting scoring to np.unique(y_pred) it no longer considers labels that appear in y_test but were never predicted.

  • 2020-11-28 04:45

    As the warning message states, the method used to get the F-score comes from the "classification" part of sklearn - hence the mention of "labels".

    Do you have a regression problem? Sklearn provides an F-score method for regression under the "feature selection" group: http://scikit-learn.org/stable/modules/generated/sklearn.feature_selection.f_regression.html
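
    A minimal sketch of that regression F-test (the data below is made up for illustration); note that it returns per-feature F-statistics and p-values, not a classification F1-score:

        import numpy as np
        from sklearn.feature_selection import f_regression

        rng = np.random.RandomState(0)
        X = rng.rand(100, 3)                        # 100 samples, 3 features
        y = 2.0 * X[:, 0] + 0.1 * rng.rand(100)     # target mostly driven by feature 0

        F_stats, p_values = f_regression(X, y)      # one F-statistic and p-value per feature
        print(F_stats, p_values)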

    In case you do have a classification problem, @Shovalt's answer seems correct to me.

  • 2020-11-28 04:46

    As mentioned in the comments, some labels in y_true don't appear in y_pred. Specifically in this case, label '2' is never predicted:

    >>> set(y_test) - set(y_pred)
    {2}
    

    This means that there is no F-score to calculate for this label, and thus the F-score for this case is considered to be 0.0. Since you requested an average of the score, you must take into account that a score of 0 was included in the calculation, and this is why scikit-learn is showing you that warning.
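
    A minimal sketch of that behaviour, using made-up toy arrays (not the asker's data):

        import numpy as np
        from sklearn.metrics import f1_score

        y_test = np.array([0, 1, 2, 0, 1, 2])   # label 2 exists in the ground truth
        y_pred = np.array([0, 1, 1, 0, 1, 0])   # ...but is never predicted

        # Per-class scores: the entry for label 2 is 0.0 and triggers the warning
        print(f1_score(y_test, y_pred, average=None))

        # The weighted average silently includes that 0.0, pulling the score down
        print(f1_score(y_test, y_pred, average='weighted'))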

    This brings me to why you are not seeing the warning a second time. As I mentioned, this is a warning, which is treated differently from an error in Python. The default behavior in most environments is to show a specific warning only once. This behavior can be changed:

    import warnings
    warnings.filterwarnings('always')  # "error", "ignore", "always", "default", "module" or "once"
    

    If you set this before importing the other modules, you will see the warning every time you run the code.

    There is no way to avoid seeing this warning the first time, aside from setting warnings.filterwarnings('ignore'). What you can do is decide that you are not interested in the scores of labels that were not predicted, and then explicitly specify the labels you are interested in (that is, labels that were predicted at least once):

    >>> metrics.f1_score(y_test, y_pred, average='weighted', labels=np.unique(y_pred))
    0.91076923076923078
    

    The warning is not shown in this case.

  • 2020-11-28 04:46

    The same problem also happened to me when I was training my classification model. The cause is, as the warning message says, "labels with no predicted samples", which leads to a zero division when computing the F1-score. I found another solution while reading the sklearn.metrics.f1_score docs, where there is the following note:

    When true positive + false positive == 0, precision is undefined; When true positive + false negative == 0, recall is undefined. In such cases, by default the metric will be set to 0, as will f-score, and UndefinedMetricWarning will be raised. This behavior can be modified with zero_division

    The default value of zero_division is "warn"; you can set it to 0 or 1 to avoid the UndefinedMetricWarning. It works for me ;) One catch: when I used zero_division, my sklearn reported that there is no such keyword argument, because I was on scikit-learn 0.21.3. Just update sklearn to the latest version by running pip install scikit-learn -U.
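
    A minimal sketch with made-up arrays, assuming a scikit-learn version recent enough to have zero_division (see the 0.21.3 caveat above):

        from sklearn.metrics import f1_score

        y_test = [0, 1, 2, 0, 1, 2]
        y_pred = [0, 1, 1, 0, 1, 0]   # label 2 is never predicted

        # No UndefinedMetricWarning; undefined per-class F-scores are set to 0
        print(f1_score(y_test, y_pred, average='weighted', zero_division=0))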

  • 2020-11-28 04:50

    As I have noticed, this error occurs under two circumstances:

    1. If you have used train_test_split() to split your data, you have to make sure you reset the index of the data (especially when it was taken from a pandas Series object): the y_train and y_test indices should be reset. The problem is that when you use one of the scores from sklearn.metrics, such as precision_score, it will try to match the shuffled indices of the y_test that you got from train_test_split().

    So use either np.array(y_test) for y_true in the scores, or y_test.reset_index(drop=True) (see the sketch after this list).

    2. Then again, you can still get this warning if your predicted 'True Positives' count is 0, since it is used for precision, recall and F1 scores. You can visualize this using a confusion_matrix. If the classification is multilabel and you set average='weighted'/'micro'/'macro', you will get a result as long as the diagonal of the matrix is not 0.
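
    A minimal sketch of both points, with made-up data and a dummy prediction standing in for a real model:

        import numpy as np
        import pandas as pd
        from sklearn.model_selection import train_test_split
        from sklearn.metrics import precision_score, confusion_matrix

        y = pd.Series([0, 1, 0, 1, 2, 1, 0, 2])
        X = pd.DataFrame({'feature': range(len(y))})
        X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

        y_pred = [0] * len(y_test)   # dummy predictions in place of a real model

        # Point 1: drop the shuffled pandas index, or pass a plain array instead
        y_test = y_test.reset_index(drop=True)              # option a
        score = precision_score(np.array(y_test), y_pred,   # option b
                                average='weighted',
                                zero_division=0)             # silence the warning for this toy example
        print(score)

        # Point 2: inspect the confusion matrix; a zero on the diagonal means
        # that class was never predicted correctly (true positives == 0)
        print(confusion_matrix(y_test, y_pred))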

    Hope this helps.

  • 2020-11-28 04:52

    The accepted answer already explains well why the warning occurs. If you simply want to control the warnings, you can use precision_recall_fscore_support. It offers a (semi-official) argument warn_for that can be used to mute the warnings.

    from sklearn import metrics

    (_, _, f1, _) = metrics.precision_recall_fscore_support(y_test, y_pred,
                                                            average='weighted',
                                                            warn_for=tuple())
    

    As mentioned already in some comments, use this with care.
