I'm trying to use the Stanford POS tagger from within NLTK. I'm following the example shown here:
http://www.nltk.org/api/nltk.tag.html#module-nltk.tag.stanford
I'm able to load everything smoothly:
>>> import os
>>> from nltk.tag import StanfordPOSTagger
>>> os.environ['STANFORD_MODELS'] = '/path/to/stanford/folder/models'
>>> st = StanfordPOSTagger('english-bidirectional-distsim.tagger',path_to_jar='/path/to/stanford/folder/stanford-postagger.jar')
but at the first execution:
>>> st.tag('What is the airspeed of an unladen swallow ?'.split())
it gives me the following error:
Loading default properties from tagger /path/to/stanford/folder/models/english-bidirectional-distsim.tagger
Exception in thread "main" java.lang.NoClassDefFoundError: org/slf4j/LoggerFactory
at edu.stanford.nlp.io.IOUtils.<clinit>(IOUtils.java:41)
at edu.stanford.nlp.tagger.maxent.TaggerConfig.<init>(TaggerConfig.java:146)
at edu.stanford.nlp.tagger.maxent.TaggerConfig.<init>(TaggerConfig.java:128)
at edu.stanford.nlp.tagger.maxent.MaxentTagger.main(MaxentTagger.java:1836)
Caused by: java.lang.ClassNotFoundException: org.slf4j.LoggerFactory
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 4 more
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/miguelwon/anaconda/lib/python2.7/site-packages/nltk/tag/stanford.py", line 66, in tag
return sum(self.tag_sents([tokens]), [])
File "/Users/miguelwon/anaconda/lib/python2.7/site-packages/nltk/tag/stanford.py", line 89, in tag_sents
stdout=PIPE, stderr=PIPE)
File "/Users/miguelwon/anaconda/lib/python2.7/site-packages/nltk/internals.py", line 134, in java
raise OSError('Java command failed : ' + str(cmd))
OSError: Java command failed : [u'/usr/bin/java', '-mx1000m', '-cp', '/path/to/stanford/folder/stanford-postagger-full-2015-12-09/stanford-postagger.jar', 'edu.stanford.nlp.tagger.maxent.MaxentTagger', '-model', '/Users/miguelwon/Documents/Kaggel/RTE/stanford-postagger-full-2015-12-09/models/english-bidirectional-distsim.tagger', '-textFile', '/var/folders/vb/dy__dnps7qz35slpmfkc25g40000gn/T/tmpwieb0M', '-tokenize', 'false', '-outputFormatOptions', 'keepEmptySentences', '-encoding', 'utf8']
A lot has changed since the earlier solution. Here is the code that worked for me after I ran into the same error. Basically, increasing the Java heap size solved it.
import os
# Point NLTK at the Java executable (adjust to your own install)
java_path = "C:\\Program Files\\Java\\jdk1.8.0_102\\bin\\java.exe"
os.environ['JAVAHOME'] = java_path
from nltk.tag.stanford import StanfordPOSTagger
# Paths are relative to the current working directory
path_to_model = "stanford-postagger-2015-12-09/models/english-bidirectional-distsim.tagger"
path_to_jar = "stanford-postagger-2015-12-09/stanford-postagger.jar"
tagger = StanfordPOSTagger(path_to_model, path_to_jar)
tagger.java_options = '-mx4096m'  # Higher memory limit for long sentences
sentence = 'This is testing'
# print() with a single argument works on both Python 2 and Python 3
print(tagger.tag(sentence.split()))
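Since a wrong path produces the same unhelpful "Java command failed" message, it may help to check the files before building the tagger. The following is only a sketch: resolve_tagger_files and heap_option are hypothetical helper names (not part of NLTK), and the folder layout is assumed to match the paths used above.

```python
import os


def resolve_tagger_files(root, model_name="english-bidirectional-distsim.tagger"):
    """Return (model_path, jar_path) under root, raising if either file is missing.

    Assumes the layout used above: <root>/models/<model> and
    <root>/stanford-postagger.jar.
    """
    model_path = os.path.join(root, "models", model_name)
    jar_path = os.path.join(root, "stanford-postagger.jar")
    missing = [p for p in (model_path, jar_path) if not os.path.isfile(p)]
    if missing:
        raise FileNotFoundError("Missing tagger files: " + ", ".join(missing))
    return model_path, jar_path


def heap_option(megabytes):
    """Format a JVM heap limit in the same '-mx<N>m' form used above."""
    if megabytes <= 0:
        raise ValueError("heap size must be positive")
    return "-mx%dm" % megabytes
```

The results would then be passed on as before, e.g. StanfordPOSTagger(model_path, jar_path) followed by tagger.java_options = heap_option(4096).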
The best thing to do is simply to download the latest version of the Stanford POS tagger, where the dependency problem is fixed (as of March 2018).
wget https://nlp.stanford.edu/software/stanford-postagger-full-2017-06-09.zip
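If upgrading is not an option, one workaround for the NoClassDefFoundError in the question is to put every jar from the tagger folder on the Java classpath, so that the slf4j classes can be found. This is a minimal sketch, assuming the slf4j jars ship somewhere under the distribution folder (I have not verified this for every release). jars_classpath is a hypothetical helper, not an NLTK function.

```python
import os


def jars_classpath(jar_dir):
    """Join every .jar found under jar_dir into a single classpath string."""
    jars = []
    for dirpath, _dirnames, filenames in os.walk(jar_dir):
        for name in filenames:
            if name.endswith(".jar"):
                jars.append(os.path.join(dirpath, name))
    # os.pathsep is ':' on Linux/macOS and ';' on Windows
    return os.pathsep.join(sorted(jars))


# Possible usage (assumption: the tagger object keeps its classpath in the
# private attribute _stanford_jar, which may differ between NLTK versions):
#   st._stanford_jar = jars_classpath(os.path.dirname(st._stanford_jar))
```

Because this relies on a private attribute, treat it as a stopgap and prefer the newer download where possible.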
Source: https://stackoverflow.com/questions/34692987/cant-make-stanford-pos-tagger-working-in-nltk