Resource u'tokenizers/punkt/english.pickle' not found

前端未结


My Code:

import nltk.data
tokenizer = nltk.data.load('nltk:tokenizers/punkt/english.pickle')

ERROR Message:

[ec2-user@ip-


                      


      
      
        
17 Answers

        
                         				            
            
           
            
                              
                
              
              
                
                  情歌与酒        
                
              
                            
                2020-12-13 02:21
              
            
            
                                                                       
I got the solution:

import nltk
nltk.download()


Once the NLTK Downloader starts, choose d (Download) and enter punkt as the identifier:

    d) Download   l) List    u) Update   c) Config   h) Help   q) Quit

Downloader> d

Download which package (l=list; x=cancel)?
  Identifier> punkt
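
For scripts or servers where the interactive downloader is inconvenient, the same download can probably be done non-interactively. This is only a minimal sketch, assuming NLTK is installed and the machine has network access. The helper name ensure_resource and its find/download parameters are hypothetical (the parameters exist only so the logic can be exercised without NLTK); nltk.data.find and nltk.download are the library calls it falls back to.

```python
def ensure_resource(resource_path, package_id, find=None, download=None):
    """Download package_id only if resource_path cannot be found.

    Returns True if a download was attempted, False if the resource was
    already present. find/download default to NLTK's own functions.
    """
    if find is None or download is None:
        import nltk  # imported lazily so the helper can be tested without NLTK
        find = find or nltk.data.find
        download = download or nltk.download
    try:
        find(resource_path)   # raises LookupError when the resource is missing
        return False
    except LookupError:
        download(package_id)  # e.g. nltk.download('punkt')
        return True
```

Calling ensure_resource('tokenizers/punkt', 'punkt') before nltk.data.load(...) should then fetch the package only when it is actually missing.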
                                                                        
                                                        
            
            
              
                
                
                  孤独总比滥情好        
                
              
                            
                2020-12-13 02:22
              
            
            
                                                                       
For me, it was solved by using the "nltk:" prefix in the resource URL (see http://www.nltk.org/howto/data.html and the related question "Failed loading english.pickle with nltk.data.load"):
sent_tokenizer=nltk.data.load('nltk:tokenizers/punkt/english.pickle')

                                                                        
                                                        
            
            
              
                
                
                  粉色の甜心        
                
              
                            
                2020-12-13 02:23
              
            
            
                                                                       
The same thing happened to me recently. You just need to download the "punkt" package and it should work.

When you run "list" (l) in the downloader after downloading everything available, is each package marked like the following line?

[*] punkt............... Punkt Tokenizer Models


If this line is marked with a star, the package is installed and NLTK should be able to load it.
                                                                        
                                                        
            
            
              
                
                
                  孤独总比滥情好        
                
              
                            
                2020-12-13 02:23
              
            
            
                                                                       
I was still getting the error despite running the following:

import nltk
nltk.download()


On Google Colab, this solved my issue:

   !python3 -c "import nltk; nltk.download('all')"

                                                                        
                                                        
            
            
              
                
                
                  执念已碎        
                
              
                            
                2020-12-13 02:25
              
            
            
                                                                       
My issue was that I called nltk.download('all') as the root user, but the process that eventually used nltk ran as another user, who didn't have access to /root/nltk_data, where the content was downloaded.

So I simply copied everything recursively from the download location to one of the paths where NLTK looks for its data, like this:

cp -R /root/nltk_data/ /home/ubuntu/nltk_data
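
To see whether the current user can actually reach the data, the check can be sketched with the standard library alone. This is a rough sketch, not NLTK's own lookup logic: readable_dirs_with_resource is a hypothetical helper name, and it assumes the package was unzipped into a tokenizers/punkt directory under one of the candidate paths.

```python
import os

def readable_dirs_with_resource(candidates, relative_path):
    """Return the candidate directories in which relative_path exists
    and is readable by the user running this process."""
    hits = []
    for base in candidates:
        target = os.path.join(base, relative_path)
        # os.path.exists is False when the user cannot even stat the path,
        # which is what happens for another user's /root/nltk_data.
        if os.path.exists(target) and os.access(target, os.R_OK):
            hits.append(base)
    return hits
```

Passing nltk.data.path (the list of directories NLTK searches) as candidates, with os.path.join('tokenizers', 'punkt') as relative_path, should show which locations are usable. An empty result for the service user would suggest the same permission problem described above.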

                                                                        
                                                        
            
            
              
                
                
                  走了就别回头了        
                
              
                            
                2020-12-13 02:27
              
            
            
                                                                       
In a Jupyter Notebook, run the following:

import nltk

nltk.download()


A popup window will then appear (showing the index https://raw.githubusercontent.com/nltk/nltk_data/gh-pages/index.xml).
From that window, download everything.

Then rerun your code.
                                                                        
                                                        
            
            
              
                