stanford-nlp

Stanford NLP: How to lemmatize single word?

Submitted by 倖福魔咒の on 2019-12-23 11:52:19

Question: I know how I can annotate a sentence and get the lemma of each word, but I don't know how to do it if I just want to lemmatize a single word. I tried

    Annotation tokenAnnotation = new Annotation("wedding");
    List<CoreMap> list = tokenAnnotation.get(SentencesAnnotation.class);
    String tokenLemma = list.get(0).get(TokensAnnotation.class)
                            .get(0).get(LemmaAnnotation.class);

but the tokenAnnotation has only a TextAnnotation key, which means list will be null here. So how can I lemmatize a single word?
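The snippet above never runs the Annotation through a pipeline, so no SentencesAnnotation (and no lemmas) ever gets attached. A minimal sketch of the usual fix, treating the single word as a one-token document (the annotator list follows the standard CoreNLP API; this assumes the CoreNLP jars and models are on the classpath):

```java
import java.util.List;
import java.util.Properties;
import edu.stanford.nlp.ling.CoreAnnotations.LemmaAnnotation;
import edu.stanford.nlp.ling.CoreAnnotations.SentencesAnnotation;
import edu.stanford.nlp.ling.CoreAnnotations.TokensAnnotation;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
import edu.stanford.nlp.util.CoreMap;

public class SingleWordLemma {
    public static void main(String[] args) {
        Properties props = new Properties();
        // lemma needs tokenize, ssplit and pos to have run first
        props.setProperty("annotators", "tokenize, ssplit, pos, lemma");
        StanfordCoreNLP pipeline = new StanfordCoreNLP(props);

        Annotation tokenAnnotation = new Annotation("wedding");
        pipeline.annotate(tokenAnnotation);   // this step was missing above

        List<CoreMap> sentences = tokenAnnotation.get(SentencesAnnotation.class);
        String lemma = sentences.get(0)
                                .get(TokensAnnotation.class)
                                .get(0).get(LemmaAnnotation.class);
        System.out.println(lemma);
    }
}
```

Once annotate() has run, the sentence and token keys exist and the original get() chain works.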

NLTK: why does nltk not recognize the CLASSPATH variable for stanford-ner?

Submitted by 放肆的年华 on 2019-12-23 09:21:12

Question: This is my code:

    from nltk.tag import StanfordNERTagger
    st = StanfordNERTagger('english.all.3class.distsim.crf.ser.gz')

and I get:

    NLTK was unable to find stanford-ner.jar! Set the CLASSPATH environment variable.

This is what my .bashrc looks like on Ubuntu:

    export CLASSPATH=/home/wolfgang/Downloads/stanford-ner-2015-04-20/stanford-ner-3.5.2.jar
    export STANFORD_MODELS=/home/wolfgang/Downloads/stanford-ner-2015-04-20/classifiers

Also, I tried printing the environment variable in Python this way …
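The error happens because NLTK searches CLASSPATH for a jar named exactly stanford-ner.jar, while the export above points at the versioned stanford-ner-3.5.2.jar. Two common workarounds, sketched with the paths from the question (treat the exact paths as assumptions):

```shell
# Workaround 1: point CLASSPATH at the jar name NLTK expects
# (the stanford-ner distribution ships a stanford-ner.jar alongside
# the versioned copy)
export CLASSPATH=/home/wolfgang/Downloads/stanford-ner-2015-04-20/stanford-ner.jar
export STANFORD_MODELS=/home/wolfgang/Downloads/stanford-ner-2015-04-20/classifiers
```

Workaround 2 is to bypass the environment variable entirely and pass the jar path to the constructor: StanfordNERTagger(model, path_to_jar='/home/wolfgang/Downloads/stanford-ner-2015-04-20/stanford-ner.jar'). Also note that exports added to .bashrc are only visible to processes started from a shell that has sourced it, not to an already-running Python session or IDE.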

Dates when using StanfordCoreNLP pipeline

Submitted by 断了今生、忘了曾经 on 2019-12-23 05:36:16

Question: If I create an AnnotationPipeline with a TokenizerAnnotator, WordsToSentencesAnnotator, POSTaggerAnnotator, and sutime, I get TimexAnnotations attached to the resulting annotation. But if I create a StanfordCoreNLP pipeline with the "annotators" property set to "tokenize, ssplit, pos, lemma, ner", I don't get TimexAnnotations, even though the relevant individual tokens are NER-tagged as DATE. Why is there this difference?

Answer 1: When I run this command:

    java -Xmx8g edu.stanford.nlp.pipeline…
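For reference, the manually built pipeline the question describes can be sketched like this (the constructor verbosity flags and the document text are assumptions; SUTime also wants a document date to resolve relative expressions against):

```java
import java.util.Properties;
import edu.stanford.nlp.ling.CoreAnnotations;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.AnnotationPipeline;
import edu.stanford.nlp.pipeline.POSTaggerAnnotator;
import edu.stanford.nlp.pipeline.TokenizerAnnotator;
import edu.stanford.nlp.pipeline.WordsToSentencesAnnotator;
import edu.stanford.nlp.time.TimeAnnotations;
import edu.stanford.nlp.time.TimeAnnotator;

public class SUTimeSketch {
    public static void main(String[] args) {
        AnnotationPipeline pipeline = new AnnotationPipeline();
        pipeline.addAnnotator(new TokenizerAnnotator(false));
        pipeline.addAnnotator(new WordsToSentencesAnnotator(false));
        pipeline.addAnnotator(new POSTaggerAnnotator(false));
        pipeline.addAnnotator(new TimeAnnotator("sutime", new Properties()));

        Annotation ann = new Annotation("The wedding is next Saturday.");
        // SUTime resolves relative dates against the document date
        ann.set(CoreAnnotations.DocDateAnnotation.class, "2019-12-23");
        pipeline.annotate(ann);

        System.out.println(ann.get(TimeAnnotations.TimexAnnotations.class));
    }
}
```

With this pipeline the TimexAnnotations key is populated on the document, which is the behavior the question contrasts with the property-driven StanfordCoreNLP pipeline.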

How to speedup Stanford NLP in Python?

Submitted by 混江龙づ霸主 on 2019-12-23 05:30:16

Question:

    import numpy as np
    from nltk.tag import StanfordNERTagger
    from nltk.tokenize import word_tokenize

    # english.all.3class.distsim.crf.ser.gz
    st = StanfordNERTagger('/media/sf_codebase/modules/stanford-ner-2018-10-16/classifiers/english.all.3class.distsim.crf.ser.gz',
                           '/media/sf_codebase/modules/stanford-ner-2018-10-16/stanford-ner.jar',
                           encoding='utf-8')

After initializing the code above, the following code takes 10 seconds to tag the text shown below. How can I speed it up?

    %%time
    text="My name…
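Each StanfordNERTagger.tag() call spawns a fresh JVM and reloads the model, which is what dominates those 10 seconds. A common workaround is to start the CoreNLP server once and tag over HTTP via NLTK's CoreNLPParser; a sketch assuming such a server is already running (the URL, port, and startup command are assumptions):

```python
# Assumes a CoreNLP server was started once in the background, e.g.:
#   java -mx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9000
from nltk.parse.corenlp import CoreNLPParser

ner_tagger = CoreNLPParser(url='http://localhost:9000', tagtype='ner')
tokens = ['My', 'name', 'is', 'Wolfgang', 'and', 'I', 'live', 'in', 'Berlin']
print(ner_tagger.tag(tokens))  # JVM startup and model loading are paid once
```

If you must stay with StanfordNERTagger, batching all sentences into a single tag_sents() call at least amortizes the per-call JVM startup over the whole batch.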

The import edu.stanford.nlp.pipeline.StanfordCoreNLP cannot be resolved?

Submitted by 北城余情 on 2019-12-23 05:15:07

Question: I am new to the Stanford NLP tagger. I have downloaded the JAR files from http://nlp.stanford.edu/software/corenlp.shtml#Download and included these four jar files:

    stanford-postagger.jar
    stanford-postagger-3.3.1.jar
    stanford-postagger-3.3.1-javadoc.jar
    stanford-postagger-3.3.1-src.jar

The main problem is that I am not able to import edu.stanford.nlp.pipeline.StanfordCoreNLP. Can anyone suggest whether any more jar files have to be included? Thank you.

Answer 1: That class is in a…
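The jars listed above are the standalone POS tagger distribution; edu.stanford.nlp.pipeline.StanfordCoreNLP lives in the full CoreNLP jar. A sketch of the Maven coordinates that pull it in, plus the companion models artifact (the version shown matches the 3.3.1 jars in the question and is an assumption about what fits this setup):

```xml
<dependency>
  <groupId>edu.stanford.nlp</groupId>
  <artifactId>stanford-corenlp</artifactId>
  <version>3.3.1</version>
</dependency>
<dependency>
  <groupId>edu.stanford.nlp</groupId>
  <artifactId>stanford-corenlp</artifactId>
  <version>3.3.1</version>
  <classifier>models</classifier>
</dependency>
```

Without a build tool, the equivalent is adding stanford-corenlp-3.3.1.jar and stanford-corenlp-3.3.1-models.jar from the CoreNLP download to the build path.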

How to get Stanford CoreNLP to use a training model you created?

Submitted by 為{幸葍}努か on 2019-12-23 05:02:02

Question: I just created a training model for Stanford CoreNLP, so I have a bunch of files that look like this: Now, how do I tell CoreNLP to use the model I created and not the models that come with CoreNLP? Is it something I pass on the command line, or something in my Java code like:

    props.put("sentiment.model");

I noticed there's a jar file in my CoreNLP library called stanford-corenlp-3.5.1-models.jar. Does this jar file have anything to do with what I want to do? Thank you.

Answer 1: in Java: props.put(…
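The props.put call above is missing its second argument: the property takes the model path as a value. A sketch of the usual pattern (the model file name below is a hypothetical placeholder for one of the trained files):

```java
// Sketch: pointing the sentiment annotator at a custom model.
// "/path/to/model-custom.ser.gz" is a hypothetical placeholder.
Properties props = new Properties();
props.setProperty("annotators", "tokenize, ssplit, parse, sentiment");
props.setProperty("sentiment.model", "/path/to/model-custom.ser.gz");
StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
```

On the command line the same property is passed as a flag, e.g. -sentiment.model /path/to/model-custom.ser.gz. The stanford-corenlp-3.5.1-models.jar is where the default models live; setting the property overrides the default without touching that jar.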

TypeError:DataType float32 for attr 'Tindices' not in list of allowed values: int32, int64

Submitted by 与世无争的帅哥 on 2019-12-23 04:57:07

Question: I am doing Stanford's CS224n course. I get an error in assignment 2, q2_parser_model.py, in my dependency parser:

    == Initializing ==
    Loading data... took 2.17 seconds
    Building parser... took 0.04 seconds
    Loading pretrained embeddings... took 2.16 seconds
    Vectorizing data... took 0.06 seconds
    Preprocessing training data...
    1000/1000 [==============================] - 1s
    Building model...
    Traceback (most recent call last):
      File "q2_parser_model.py", line 286, in <module>
        main()
      File "q2_parser_model…
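The "DataType float32 for attr 'Tindices'" error means a float32 tensor is being used where integer indices are required, typically the ids fed to tf.gather or tf.nn.embedding_lookup. The usual fix is to cast them (ids = tf.cast(ids, tf.int32)) or to declare the input placeholder with dtype=tf.int32 in the first place. The integer-index rule itself can be illustrated with plain Python (a toy stand-in, not the assignment's code):

```python
# Toy illustration: sequence lookups require integer indices,
# just as TensorFlow's 'Tindices' attr only allows int32/int64.
embeddings = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]

def lookup(table, ids):
    # mirrors the fix: cast float ids to int before indexing
    return [table[int(i)] for i in ids]

try:
    embeddings[1.0]            # float index -> TypeError, like the TF error
except TypeError as e:
    print("float index rejected:", e)

print(lookup(embeddings, [2.0, 0.0]))  # after casting: [[0.5, 0.6], [0.1, 0.2]]
```

In the assignment, the place to look is where the training-example tensors are created: indices vectorized as floats propagate into the embedding lookup and trigger exactly this traceback.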

nlp - How to detect if a word in a sentence is pointing to a color/body part /vehicle

Submitted by 随声附和 on 2019-12-23 02:22:14

Question: So, as the title suggests, I would like to know if a certain word in a sentence is pointing to:

1] A color: "The grass is green." Hence "green" is a color.
2] A body part: "Her hands are soft." Hence "hands" is a body part.
3] A vehicle: "I am driving my car on the causeway." Hence "car" is a vehicle.

In similar problems, parsers are one of the possible effective solutions. The Stanford parser, for example, was suggested for a similar question, "How to find if a word in a sentence is pointing to a city". Now the problem…
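Besides a parser, a common approach for this kind of category check is to walk a word's hypernym chain in a lexical resource such as WordNet and test whether it passes through "color", "body part", or "vehicle". A toy sketch of the idea with a hand-built hypernym table (the table entries are illustrative assumptions; a real system would query WordNet, e.g. through NLTK's wordnet corpus):

```python
# Tiny hand-built hypernym table standing in for WordNet.
HYPERNYMS = {
    "green": "color",
    "hands": "body part",
    "car": "motor vehicle",
    "motor vehicle": "vehicle",
}

def category_of(word, targets=("color", "body part", "vehicle")):
    """Walk the hypernym chain until a target category or a dead end."""
    seen = set()
    current = word
    while current in HYPERNYMS and current not in seen:
        seen.add(current)
        current = HYPERNYMS[current]
        if current in targets:
            return current
    return None

print(category_of("green"))   # color
print(category_of("hands"))   # body part
print(category_of("car"))     # vehicle (via "motor vehicle")
print(category_of("grass"))   # None
```

Note that this only classifies the word in isolation; disambiguating which sense is meant in context (e.g. "green" as a color vs. a surname) is where a parser or word-sense disambiguation would still be needed.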

Scala - spark-corenlp - java.lang.ClassNotFoundException

Submitted by 人走茶凉 on 2019-12-23 01:53:07

Question: I want to run the spark-corenlp example, but I get a java.lang.ClassNotFoundException error when running spark-submit. Here is the Scala code, from the GitHub example, which I put into an object with a SparkContext defined. analyzer.Sentiment.scala:

    package analyzer

    import org.apache.spark.SparkContext
    import org.apache.spark.SparkContext._
    import org.apache.spark.SparkConf
    import org.apache.spark.sql.functions._
    import com.databricks.spark.corenlp.functions._
    import sqlContext.implicits._
    …
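A ClassNotFoundException at spark-submit time usually means the spark-corenlp and CoreNLP jars were on the compile classpath but were never shipped with the job. A sketch of the usual fix (the package coordinate, jar names, versions, and Scala version are assumptions for this setup; building a single fat jar with sbt-assembly is the main alternative):

```shell
spark-submit \
  --class analyzer.Sentiment \
  --packages databricks:spark-corenlp:0.2.0-s_2.10 \
  --jars stanford-corenlp-3.6.0.jar,stanford-corenlp-3.6.0-models.jar \
  target/scala-2.10/analyzer_2.10-1.0.jar
```

--packages resolves the dependency from the repository and distributes it to the executors, while --jars ships local jars directly; either way, the driver and every executor end up with the classes that were missing.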