nltk StanfordNERTagger : NoClassDefFoundError: org/slf4j/LoggerFactory (In Windows)

后端未结

关注

 9  2062

NOTE: I am using Python 2.7 as part of Anaconda distribution. I hope this is not a problem for nltk 3.1.

I am trying to use nltk for NER as

import nl


                      
              相关标签:


      
      
        
          9条回答        

        
                         				            
            
           
            
                              
                
              
              
                
                  被撕碎了的回忆        
                
              
                            
                2020-12-15 10:36
              
            
            
                                                                       
i fixed!

u should indicate the full path of slf4j-api.jar in CLASSPATH

instead of add jar-path into system environment variable, u can do like this in code:

_CLASS_PATH = "."    
if os.environ.get('CLASSPATH') is not None:
    _CLASS_PATH = os.environ.get('CLASSPATH')
os.environ['CLASSPATH'] = _CLASS_PATH + ';F:\Python\Lib\slf4j\slf4j-api-1.7.13.jar'


important, in nltk/*/stanford.py will reset the classpath like this:

stdout, stderr = java(cmd, classpath=self._stanford_jar, stdout=PIPE, stderr=PIPE)


eg. \Python34\Lib\site-packages\nltk\tokenize\stanford.py line:90

u can fix it like this:

_CLASS_PATH = "."
if os.environ.get('CLASSPATH') is not None:
    _CLASS_PATH = os.environ.get('CLASSPATH')
stdout, stderr = java(cmd, classpath=(self._stanford_jar, _CLASS_PATH), stdout=PIPE, stderr=PIPE)

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  孤街浪徒        
                
              
                            
                2020-12-15 10:37
              
            
            
                                                                       
NOTE:

Below is a temporal hack to work with:


NLTK version 3.1
Stanford NER compiled on 2015-12-09


This solution is NOT meant to be an eternal solution. 

Always refer to https://github.com/nltk/nltk/wiki/Installing-Third-Party-Software for the latest instruction on how to interface Stanford NLP tools using NLTK!!

Please track updates on this issue if you do not want to use this "hack": https://github.com/nltk/nltk/issues/1237 or please use the NER tool compield on 2015-04-20.



In Short

Make sure that you have:


NLTK version 3.1
Stanford NER compiled on 2015-12-09
Set the environment variables for CLASSPATH and STANFORD_MODELS


To set environment variables in Windows:

set CLASSPATH=%CLASSPATH%;C:\some\path\to\stanford-ner\stanford-ner.jar
set STANFORD_MODELS=%STANFORD_MODELS%;C:\some\path\to\stanford-ner\classifiers


To set environment variables in Linux:

export STANFORDTOOLSDIR=/home/some/path/to/stanfordtools/
export CLASSPATH=$STANFORDTOOLSDIR/stanford-ner-2015-12-09/stanford-ner.jar
export STANFORD_MODELS=$STANFORDTOOLSDIR/stanford-ner-2015-12-09/classifiers


Then:

>>> from nltk.internals import find_jars_within_path
>>> from nltk.tag import StanfordNERTagger
>>> st = StanfordNERTagger('english.all.3class.distsim.crf.ser.gz') 
# Note this is where your stanford_jar is saved.
# We are accessing the environment variables you've 
# set through the NLTK API.
>>> print st._stanford_jar
/home/alvas/stanford-ner-2015-12-09/stanford-ner.jar
>>> stanford_dir = st._stanford_jar.rpartition("\\")[0] # windows
# Note in linux you do this instead: 
>>> stanford_dir = st._stanford_jar.rpartition('/')[0] # linux
# Use the `find_jars_within_path` function to get all the
# jar files out from stanford NER tool under the libs/ dir.
>>> stanford_jars = find_jars_within_path(stanford_dir)
# Put the jars back into the `stanford_jar` classpath.
>>> st._stanford_jar = ':'.join(stanford_jars) # linux
>>> st._stanford_jar = ';'.join(stanford_jars) # windows
>>> st.tag('Rami Eid is studying at Stony Brook University in NY'.split())
[(u'Rami', u'PERSON'), (u'Eid', u'PERSON'), (u'is', u'O'), (u'studying', u'O'), (u'at', u'O'), (u'Stony', u'ORGANIZATION'), (u'Brook', u'ORGANIZATION'), (u'University', u'ORGANIZATION'), (u'in', u'O'), (u'NY', u'O')]

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  耶瑟儿～        
                
              
                            
                2020-12-15 10:45
              
            
            
                                                                       
I think the issue is with how slf4j has been used. 

I am on nltk 3.1 and using stanford-parser-full-2015-12-09. I only way I could get it to work was to modify /Library/Python/2.7/site-packages/nltk/parse/stanford.py and add the slf4j jar to self._classpath within init method.

That solved it. Crude - but - works.

Note - I was not trying NER exactly. I was trying something like below

import os
from nltk.parse import stanford
os.environ['STANFORD_PARSER'] = '/Users/run2/stanford-parser-full-2015-12-09'
os.environ['STANFORD_MODELS'] = '/Users/run2/stanford-parser-full-2015-12-09'
parser = stanford.StanfordParser(model_path='/Users/run2/stanford-parser-full-2015-12-09/englishPCFG.ser.gz')
sentences = parser.raw_parse_sents('<some sentence from my corpus>')

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
   
          
     上一页
1
2
           
           
        
                                  
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复