Scikit-Learn's Pipeline: A sparse matrix was passed, but dense data is required


傲寒 2020-12-07 19:04

I'm finding it difficult to understand how to fix a Pipeline I created (read: largely pasted from a tutorial). It's Python 3.4.2:

df = pd.DataFrame
df = Da


      
      
        
5 answers

[愿得一人] (original poster) 2020-12-07 19:35
              

            
            
                        
You can convert a pandas Series to a NumPy array using the .values attribute.

pipeline.fit(df[0].values, df[1].values)


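However, I think the real issue is that CountVectorizer() returns a sparse matrix by default, which cannot be piped straight into the RF classifier here (older scikit-learn releases required dense input for random forests). CountVectorizer() does have a dtype parameter, but it only sets the element type of the returned matrix; the result stays sparse, so you have to convert it yourself, for example by calling .toarray() on it or by adding a densifying step to the pipeline. That said, you usually need some sort of dimensionality reduction to use random forests for text classification, because bag-of-words feature vectors are very long.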
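Here is a minimal sketch of two ways around the error, in case it helps. The toy data, the step names and the to_dense helper are my own assumptions for illustration, not taken from the question. Option 1 inserts a step that turns the sparse matrix into a dense array; option 2 uses TruncatedSVD, which accepts sparse input and outputs a small dense matrix, so it covers the dimensionality reduction as well.

```python
import pandas as pd
from sklearn.decomposition import TruncatedSVD
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import FunctionTransformer

# Hypothetical toy data standing in for the question's DataFrame:
# column 0 holds the text, column 1 holds the label.
df = pd.DataFrame({
    0: ["good movie", "bad movie", "great film",
        "awful film", "good film", "bad plot"],
    1: [1, 0, 1, 0, 1, 0],
})
texts = df[0].values
labels = df[1].values


def to_dense(X):
    """Convert a scipy sparse matrix to a dense NumPy array."""
    return X.toarray()


# Option 1: densify the bag-of-words matrix before the classifier.
# Fine for small vocabularies, memory-hungry for large ones.
dense_pipeline = Pipeline([
    ("vectorizer", CountVectorizer()),
    ("to_dense", FunctionTransformer(to_dense, accept_sparse=True)),
    ("classifier", RandomForestClassifier(n_estimators=10, random_state=0)),
])
dense_pipeline.fit(texts, labels)

# Option 2: reduce dimensionality with TruncatedSVD, which takes the
# sparse matrix directly and returns a dense array with few columns.
# n_components=2 is only suitable for this tiny example.
svd_pipeline = Pipeline([
    ("vectorizer", CountVectorizer()),
    ("svd", TruncatedSVD(n_components=2, random_state=0)),
    ("classifier", RandomForestClassifier(n_estimators=10, random_state=0)),
])
svd_pipeline.fit(texts, labels)

print(dense_pipeline.predict(["good film"]))
print(svd_pipeline.predict(["bad movie"]))
```

With a real corpus, option 1 can exhaust memory because the dense matrix has one column per vocabulary term, so option 2 (with a larger n_components) is probably the safer default.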
    
             
                                                        
            
            
              
                