nltk StanfordNERTagger : NoClassDefFoundError: org/slf4j/LoggerFactory (In Windows)

轮回少年 2020-12-15 09:43

NOTE: I am using Python 2.7 as part of the Anaconda distribution. I hope this is not a problem for NLTK 3.1.

I am trying to use nltk for NER as

import nl         


        
9 answers
  • 2020-12-15 10:19

    EDITED

    Note: The following answer will only work on:

    • NLTK version 3.1
    • Stanford Tools compiled since 2015-04-20

    Both tools change rather quickly, and the API might look very different 3-6 months from now. Please treat the following answer as temporary, not as a permanent fix.

    Always refer to https://github.com/nltk/nltk/wiki/Installing-Third-Party-Software for the latest instructions on how to interface with the Stanford NLP tools using NLTK!


    Step 1

    First update your NLTK to the version 3.1 using

    pip install -U nltk
    

    or (for Windows) download the latest NLTK using http://pypi.python.org/pypi/nltk

    Then check that you have version 3.1 using:

    python -c "import nltk; print(nltk.__version__)"
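
The same check can be done from inside a script. The following is a minimal sketch, not part of NLTK: `version_at_least` is a hypothetical helper that compares a dotted version string against a minimum tuple.

```python
import re


def version_at_least(version, minimum):
    """Return True if a dotted version string is >= the `minimum` tuple."""
    parts = []
    for piece in version.split("."):
        # Keep only the leading digits of each part, so a suffix such as
        # "b1" in "3.0.0b1" does not break the comparison.
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    # Pad both sides with zeros so "3.1" compares equal to (3, 1, 0).
    width = max(len(parts), len(minimum))
    parts += [0] * (width - len(parts))
    wanted = list(minimum) + [0] * (width - len(minimum))
    return parts >= wanted
```

With NLTK installed, `version_at_least(nltk.__version__, (3, 1))` should be true before you continue with the steps below.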
    

    Step 2

    Then download the zip file from http://nlp.stanford.edu/software/stanford-ner-2015-04-20.zip, unzip it, and save the contents to C:\some\path\to\stanford-ner\ (on Windows)

    Step 3

    Then set the environment variable for CLASSPATH to C:\some\path\to\stanford-ner\stanford-ner.jar

    and the environment variable for STANFORD_MODELS to C:\some\path\to\stanford-ner\classifiers

    Or on the command line (Windows ONLY):

    set CLASSPATH=%CLASSPATH%;C:\some\path\to\stanford-ner\stanford-ner.jar
    set STANFORD_MODELS=%STANFORD_MODELS%;C:\some\path\to\stanford-ner\classifiers
    

    (See https://stackoverflow.com/a/17176423/610569 for click-click GUI instructions for setting environment variables in Windows)

    (See Stanford Parser and NLTK for details on setting environment variables in Linux)
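
If you would rather set these variables from Python than from the shell, something like the sketch below may work. `stanford_env` is a hypothetical helper, not an NLTK function. It assumes the unzipped folder contains `stanford-ner.jar` and a `classifiers` subfolder, as described in Step 3, and it appends to any existing value the same way the `set` commands above do.

```python
import os


def stanford_env(ner_dir, environ=None):
    """Return a copy of `environ` with CLASSPATH and STANFORD_MODELS
    extended to point into `ner_dir` (hypothetical helper)."""
    env = dict(os.environ if environ is None else environ)
    additions = {
        "CLASSPATH": os.path.join(ner_dir, "stanford-ner.jar"),
        "STANFORD_MODELS": os.path.join(ner_dir, "classifiers"),
    }
    for name, value in additions.items():
        existing = env.get(name)
        # Append to an existing value using the platform separator
        # (';' on Windows, ':' elsewhere), or start a new value.
        env[name] = existing + os.pathsep + value if existing else value
    return env
```

Calling `os.environ.update(stanford_env(r"C:\some\path\to\stanford-ner"))` before creating the tagger would then affect only the current Python process.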

    Step 4

    Then in python:

    >>> from nltk.tag import StanfordNERTagger
    >>> st = StanfordNERTagger('english.all.3class.distsim.crf.ser.gz') 
    >>> st.tag('Rami Eid is studying at Stony Brook University in NY'.split())
    [(u'Rami', u'PERSON'), (u'Eid', u'PERSON'), (u'is', u'O'), (u'studying', u'O'), (u'at', u'O'), (u'Stony', u'ORGANIZATION'), (u'Brook', u'ORGANIZATION'), (u'University', u'ORGANIZATION'), (u'in', u'O'), (u'NY', u'O')]
    

    Without setting the environment variables, you can try:

    import os
    from nltk.tag import StanfordNERTagger
    
    # Raw string keeps the backslashes literal ('\t' in a normal string is a tab)
    stanford_ner_dir = r'C:\some\path\to\stanford-ner'
    eng_model_filename = os.path.join(stanford_ner_dir, 'classifiers', 'english.all.3class.distsim.crf.ser.gz')
    my_path_to_jar = os.path.join(stanford_ner_dir, 'stanford-ner.jar')
    
    st = StanfordNERTagger(model_filename=eng_model_filename, path_to_jar=my_path_to_jar)
    st.tag('Rami Eid is studying at Stony Brook University in NY'.split())
    

    See more detailed instructions on Stanford Parser and NLTK

  • 2020-12-15 10:21

    It looks like the Java environment is not set for Python in your code.

    You could do that by using the following code:

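    import os
    # NLTK 3.1 names this class StanfordNERTagger; older releases called it NERTagger
    from nltk.tag.stanford import StanfordNERTagger
    # Tell NLTK where the Java executable lives
    java_path = "/Java/jdk1.8.0_45/bin/java.exe"
    os.environ['JAVAHOME'] = java_path
    st = StanfordNERTagger('../ner-model.ser.gz','../stanford-ner.jar')
    text = 'Rami Eid is studying at Stony Brook University in NY'  # sample input
    tagging = st.tag(text.split())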
    

    Check if this solves your problem.
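
If you do not want to hard-code the JDK path, a rough sketch along these lines could look the executable up on PATH instead. `ensure_javahome` is a hypothetical helper, and the sketch assumes Python 3.3+ (for `shutil.which`) and that `JAVAHOME` may point directly at the `java` executable, as in the snippet above.

```python
import os
import shutil


def ensure_javahome(environ=None):
    """Set JAVAHOME to the java executable found on PATH, unless it is
    already set. Returns the path in use, or None if java was not found."""
    env = os.environ if environ is None else environ
    if env.get("JAVAHOME"):
        # Respect an explicit setting.
        return env["JAVAHOME"]
    java = shutil.which("java", path=env.get("PATH"))
    if java is not None:
        env["JAVAHOME"] = java
    return java
```

A return value of `None` suggests that Java is either not installed or not on PATH, which is worth ruling out before debugging the classpath.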

  • 2020-12-15 10:21

    The best thing to do is simply to download the latest version of the Stanford NER tagger where the dependency problem is now fixed (March 2018).

    wget https://nlp.stanford.edu/software/stanford-ner-2018-02-27.zip
    
  • 2020-12-15 10:27

    For those who want to use Stanford NER >= 3.6.0 instead of the 2015-01-30 (3.5.1) or other old version, do this instead:

    1. Put the stanford-ner.jar and slf4j-api.jar into the same folder

      For example, I put the following files in /path-to-libs/

      • stanford-ner-3.6.0.jar
      • slf4j-api-1.7.18.jar
    2. Then:

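      import nltk
      
      # Wildcard classpath: Java loads every jar in the folder, including slf4j-api
      classpath = "/path-to-libs/*"
      
      st = nltk.tag.StanfordNERTagger(
          "/path-to-model/ner-model.ser.gz",
          "/path-to-libs/stanford-ner-3.6.0.jar"
      )
      # Replace the single-jar classpath stored by the tagger with the wildcard
      st._stanford_jar = classpath
      result = st.tag(["Hello"])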
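
As an alternative to the `*` wildcard, the classpath can be listed explicitly, which also makes it easy to confirm that an slf4j jar is actually present. This is only a sketch: `build_classpath` and `has_slf4j` are hypothetical helpers, and the check is a simple file-name test, not proof that the right slf4j classes are inside the jar.

```python
import glob
import os


def build_classpath(lib_dir):
    """Join every .jar file in `lib_dir` into one classpath string."""
    jars = sorted(glob.glob(os.path.join(lib_dir, "*.jar")))
    # os.pathsep is ';' on Windows and ':' on Linux/macOS.
    return os.pathsep.join(jars)


def has_slf4j(classpath):
    """Return True if some classpath entry is named like an slf4j jar."""
    return any(
        os.path.basename(entry).startswith("slf4j")
        for entry in classpath.split(os.pathsep)
    )
```

If `has_slf4j(build_classpath("/path-to-libs"))` is false, the `NoClassDefFoundError: org/slf4j/LoggerFactory` from the question is the likely result.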
      
  • 2020-12-15 10:29

    I encountered exactly the same problem as you described yesterday.

    There are 3 things you need to do.

    1) Update your NLTK.

    pip install -U nltk
    

    Your version should be 3.1 or later. I see you are using

    from nltk.tag.stanford import StanfordNERTagger
    

    However, you need to import from the new module:

    from nltk.tag import StanfordNERTagger
    

    2) Download slf4j and update your CLASSPATH.

    Here is how you update your CLASSPATH.

    import os
    # Entries are separated by ':' on macOS/Linux; use ';' instead on Windows
    javapath = "/Users/aerin/Downloads/stanford-ner-2014-06-16/stanford-ner.jar:/Users/aerin/java/slf4j-1.7.13/slf4j-log4j12-1.7.13.jar"
    os.environ['CLASSPATH'] = javapath
    

    As you can see above, javapath contains two paths: one is the location of stanford-ner.jar, and the other is where you downloaded slf4j-log4j12-1.7.13.jar (available from http://www.slf4j.org/download.html).

    3) Don't forget to specify where you downloaded 'english.all.3class.distsim.crf.ser.gz' & 'stanford-ner.jar'

    st = StanfordNERTagger('/Users/aerin/Downloads/stanford-ner-2014-06-16/classifiers/english.all.3class.distsim.crf.ser.gz','/Users/aerin/Downloads/stanford-ner-2014-06-16/stanford-ner.jar') 
    
    st.tag("Doneyo lab did such an awesome job!".split())
    
  • 2020-12-15 10:35

    The current Stanford NER tagger version is not compatible with NLTK because it requires additional jars that NLTK cannot add to the CLASSPATH.

    Instead, use an older version of the Stanford NER tagger that works fine, such as this one: http://nlp.stanford.edu/software/stanford-ner-2015-04-20.zip
