Resource u'tokenizers/punkt/english.pickle' not found

My Code:

import nltk.data
tokenizer = nltk.data.load('nltk:tokenizers/punkt/english.pickle')

ERROR Message:

[ec2-user@ip-


                      
17 answers

一整个雨季 (2020-12-13 02:29)
                                                                       
Adding this line before you load the tokenizer fixes the issue:

nltk.download('punkt')
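
If you would rather not trigger a download on every run, something like the sketch below may help. It assumes that `nltk.data.find` raises `LookupError` when the resource is missing (the same error the question shows), and `ensure_punkt` is just a hypothetical helper name, not part of NLTK.

```python
def ensure_punkt():
    """Download the punkt data only if NLTK cannot already find it.

    Returns True if a download was triggered, False if the data was present.
    """
    # Imported inside the function so the helper can be defined
    # even before NLTK itself has been loaded.
    import nltk

    try:
        # Same resource folder the error message complains about.
        nltk.data.find("tokenizers/punkt")
        return False
    except LookupError:
        # Resource missing: fetch it once, as suggested above.
        nltk.download("punkt")
        return True
```

Call `ensure_punkt()` once at startup, before any code that loads the tokenizer.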

粉色の甜心 (2020-12-13 02:31)
                                                                       
A plain nltk.download() did not solve this for me. What worked was the following:

In the nltk folder (the nltk_data directory), create a tokenizers folder and copy your punkt folder into it.

The resulting folder structure should be nltk_data/tokenizers/punkt, which matches the resource path tokenizers/punkt/english.pickle from the error.
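
To check whether the folder ended up in a place NLTK actually searches, a rough sketch like this can help. `dirs_with_punkt` is a hypothetical helper, and it only looks for an unpacked `tokenizers/punkt` folder, not a zipped one.

```python
import os


def dirs_with_punkt(search_paths):
    """Return the entries of search_paths that contain tokenizers/punkt."""
    found = []
    for base in search_paths:
        candidate = os.path.join(base, "tokenizers", "punkt")
        # Only an unpacked folder counts here; a punkt.zip is not checked.
        if os.path.isdir(candidate):
            found.append(base)
    return found


# Typical use (requires NLTK to be installed):
#   import nltk
#   print(nltk.data.path)                   # directories NLTK searches
#   print(dirs_with_punkt(nltk.data.path))  # which of them hold punkt
```

If the second list is empty, the punkt folder is probably sitting outside every directory in `nltk.data.path`.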
                                                                        
無奈伤痛 (2020-12-13 02:32)

Execute the following code:

import nltk
nltk.download()

After this, the NLTK downloader window will open.
Select the All Packages tab.
Download punkt.

野性不改 (2020-12-13 02:38)
                                                                       
From the shell you can execute:

sudo python -m nltk.downloader punkt 


If you want to install the popular NLTK corpora/models:

sudo python -m nltk.downloader popular


If you want to install all NLTK corpora/models:

sudo python -m nltk.downloader all


To list the resources you have downloaded:

python -c 'import os; import nltk; print(os.listdir(nltk.data.find("corpora")))'
python -c 'import os; import nltk; print(os.listdir(nltk.data.find("tokenizers")))'
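
Note that `nltk.data.find` raises a `LookupError` if a category has never been downloaded, so the one-liners above can fail on a fresh machine. As a rough alternative, the sketch below lists what is present under one data directory without going through NLTK. `list_resources` and its `data_dir` argument are hypothetical names; `data_dir` is assumed to be one of the directories in `nltk.data.path`.

```python
import os


def list_resources(data_dir, categories=("corpora", "tokenizers")):
    """Map each category to the sorted entries found under data_dir.

    A category whose folder does not exist maps to an empty list.
    """
    listing = {}
    for category in categories:
        folder = os.path.join(data_dir, category)
        if os.path.isdir(folder):
            listing[category] = sorted(os.listdir(folder))
        else:
            listing[category] = []
    return listing
```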

别那么骄傲 (2020-12-13 02:38)
                                                                       
import nltk
nltk.download('punkt')


Open a Python prompt and run the statements above.

The sent_tokenize function uses an instance of PunktSentenceTokenizer from the
nltk.tokenize.punkt module. This instance has already been trained and works well for
many European languages, so it knows which punctuation and characters mark the end of
one sentence and the beginning of the next.
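
For reference, a minimal sketch of calling sent_tokenize once the punkt data is in place. `split_sentences` is a hypothetical wrapper, and the assumption is that a missing punkt download surfaces as the `LookupError` from the question.

```python
def split_sentences(text):
    """Split text into sentences with NLTK's pre-trained Punkt tokenizer."""
    # Imported inside the function so a missing NLTK install
    # fails here rather than when this helper is defined.
    from nltk.tokenize import sent_tokenize

    try:
        return sent_tokenize(text)
    except LookupError as err:
        # Raised by NLTK when the punkt data has not been downloaded yet.
        raise RuntimeError(
            "punkt data not found; run nltk.download('punkt') first"
        ) from err


# Example call (needs NLTK and the punkt data installed):
#   split_sentences("Hello there. How are you?")
```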