R tm package invalid input in 'utf8towcs'

前端未结

关注

 14  1421

逝去的感伤 2020-11-29 01:47

I\'m trying to use the tm package in R to perform some text analysis. I tied the following:

require(tm)
dataSet <- Corpus(DirSource(\'tmp/\'))
dataSet <


      
      
        
          14条回答        

        
                    
            
            
                         
                
              
              
                
                   情书的邮戳
                                             
                
                
                (楼主)
            
              
              
                2020-11-29 02:07
              

            
            
                        
I have been running this on Mac and to my frustration,I had to identify the foul record (as these were tweets) to resolve. Since the next time, there is no guarantee of the record being the same, I used the following function

tm_map(yourCorpus, function(x) iconv(x, to='UTF-8-MAC', sub='byte'))


as suggested above.

It worked like a charm
    
             
                                                        
            
            
              
                
                0
              
                   
                
               讨论(0)
              
                                                  
              
              
                          
             
       
          
              
                                       
     查看其它14个回答


            
                         
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
                              			
        
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复