stanford-nlp

Concurrent processing using Stanford CoreNLP (3.5.2)

拥有回忆 posted on 2019-12-19 02:53:13
Question: I'm facing a concurrency problem when annotating multiple sentences simultaneously. It's unclear to me whether I'm doing something wrong or whether there is a bug in CoreNLP. My goal is to annotate sentences with the pipeline "tokenize, ssplit, pos, lemma, ner, parse, dcoref" using several threads running in parallel. Each thread allocates its own instance of StanfordCoreNLP and then uses it for the annotation. The problem is that at some point an exception is thrown: java.util

How to NER and POS tag a pre-tokenized text with Stanford CoreNLP?

陌路散爱 posted on 2019-12-18 17:33:11
Question: I'm using Stanford CoreNLP's Named Entity Recognizer (NER) and Part-of-Speech (POS) tagger in my application. The problem is that my code tokenizes the text beforehand, and I then need to NER and POS tag each token. However, I was only able to find out how to do that with the command-line options, not programmatically. Can someone please tell me how I can NER and POS tag pretokenized text programmatically using Stanford's CoreNLP? Edit: I'm actually using the individual NER and POS
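One way to hand pre-tokenized text to a full CoreNLP pipeline is to join the tokens with spaces, put one sentence per line, and set the `tokenize.whitespace` and `ssplit.eolonly` properties so that the pipeline does not re-tokenize or re-split. The sketch below only prepares that input and the matching properties text. It assumes the full pipeline is used rather than the standalone taggers mentioned in the truncated edit, and the helper names `build_pretokenized_input` and `to_properties_file` are hypothetical.

```python
# Properties for a CoreNLP pipeline that should trust the caller's tokenization.
# tokenize.whitespace: split tokens on whitespace only.
# ssplit.eolonly: split sentences on newlines only.
PROPS = {
    "annotators": "tokenize,ssplit,pos,lemma,ner",
    "tokenize.whitespace": "true",
    "ssplit.eolonly": "true",
}


def build_pretokenized_input(sentences):
    """Join token lists into text: one sentence per line, tokens space-separated."""
    lines = []
    for tokens in sentences:
        for token in tokens:
            # A token containing whitespace would be split again by the pipeline.
            if any(ch.isspace() for ch in token):
                raise ValueError("token contains whitespace: %r" % token)
        lines.append(" ".join(tokens))
    return "\n".join(lines)


def to_properties_file(props):
    """Render the properties dict in Java .properties syntax (hypothetical helper)."""
    return "\n".join("%s = %s" % (key, value) for key, value in props.items())


if __name__ == "__main__":
    print(build_pretokenized_input([["John", "works"], ["He", "left", "."]]))
    print(to_properties_file(PROPS))
```

The same two properties can be set on a `java.util.Properties` object before constructing the pipeline in Java. Tokens that contain spaces need to be escaped or rejected first, which is why the sketch raises an error for them.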

Parse out phrasal verbs

删除回忆录丶 posted on 2019-12-18 17:03:32
Question: Has anyone ever tried parsing out phrasal verbs with Stanford NLP? The problem is with separable phrasal verbs, e.g. "climb up" and "do over": "We climbed that hill up." "I have to do this job over." The first phrase looks like this in the parse tree: (VP (VBD climbed) (ADVP (IN that) (NP (NN hill) ) ) (ADVP (RB up) ) ) and the second phrase: (VB do) (NP (DT this) (NN job) ) (PP (IN over) ) So it seems that reading the parse tree would be the right way, but how can I know that a verb is going to be phrasal? Answer 1
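The parse trees above show the particle tagged inconsistently (RB in one case, IN in the other), so the tag alone does not identify a phrasal verb. One possible approach, which is an assumption here and not taken from the truncated answer, is to look each verb lemma and a nearby following word up in a lexicon of known phrasal verbs. The lexicon `PHRASAL_VERBS`, the function `find_phrasal_verbs`, and the `max_gap` window are all hypothetical; the input is assumed to be (lemma, tag) pairs produced by a tagger and lemmatizer.

```python
# Hypothetical mini-lexicon; a real one would come from a dictionary resource.
PHRASAL_VERBS = {("climb", "up"), ("do", "over")}


def find_phrasal_verbs(tokens, max_gap=4):
    """Return (verb_index, particle_index) pairs for separable phrasal verbs.

    tokens: list of (lemma, Penn Treebank tag) pairs for one sentence.
    max_gap: how many tokens after the verb to search for the particle.
    """
    matches = []
    for i, (lemma, tag) in enumerate(tokens):
        if not tag.startswith("VB"):
            continue
        # Scan a short window after the verb for a known particle.
        for j in range(i + 1, min(i + 1 + max_gap, len(tokens))):
            particle = tokens[j][0].lower()
            if (lemma.lower(), particle) in PHRASAL_VERBS:
                matches.append((i, j))
                break
    return matches


if __name__ == "__main__":
    sentence = [("we", "PRP"), ("climb", "VBD"), ("that", "IN"),
                ("hill", "NN"), ("up", "RB"), (".", ".")]
    print(find_phrasal_verbs(sentence))
```

A lexicon lookup cannot tell a true particle from a preposition that happens to follow the verb, so the parse tree (for example, whether the candidate word heads its own phrase under the same VP) would still be needed to filter false positives.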

nltk StanfordNERTagger : How to get proper nouns without capitalization

邮差的信 posted on 2019-12-18 13:35:08
Question: I am trying to use the StanfordNERTagger and nltk to extract keywords from a piece of text. docText="John Donk works for POI. Brian Jones wants to meet with Xyz Corp. for measuring POI's Short Term performance Metrics." words = re.split("\W+",docText) stops = set(stopwords.words("english")) #remove stop words from the list words = [w for w in words if w not in stops and len(w) > 2] str = " ".join(words) print str stn = StanfordNERTagger('english.all.3class.distsim.crf.ser.gz') stp =

finding noun and verb in stanford parser

Deadly posted on 2019-12-18 12:34:53
Question: I need to find out whether a word is a verb, a noun, or both. For example, the word "search" can be both a noun and a verb, but the Stanford parser gives it the NN tag. Is there any way to make the Stanford parser report that "search" is both a noun and a verb? The code that I use now: public static String Lemmatize(String word) { WordTag w = new WordTag(word); w.setTag(POSTagWord(word)); Morphology m = new Morphology(); WordLemmaTag wT = m.lemmatize(w); return wT.lemma(); } Or should I use any other

Parse sentence Stanford Parser by passing String not an array of strings

梦想的初衷 posted on 2019-12-18 09:29:21
Question: Is it possible to parse a sentence using the Stanford Parser by passing a string and not an array of strings? This is the example they gave in their short tutorial (see Docs): import java.util.*; import edu.stanford.nlp.ling.*; import edu.stanford.nlp.trees.*; import edu.stanford.nlp.parser.lexparser.LexicalizedParser; class ParserDemo { public static void main(String[] args) { LexicalizedParser lp = LexicalizedParser.loadModel("edu/stanford/nlp/models/lexparser/englishPCFG

Finding Tense of A sentence using stanford nlp

半腔热情 posted on 2019-12-18 05:54:00
Question: Q1. I am trying to get the tense of a complete sentence, but I don't know how to do it using NLP. Any help appreciated. Q2. What information can be extracted from a sentence using NLP? Currently I can get: 1. Voice of the sentence 2. Subject, object, verb 3. POS tags. If any more information can be extracted, please let me know. Answer 1: The Penn treebank defines VBD and VBN as the past tense and the past participle of a verb, respectively. In many sentences, simply getting the POS tags and checking for the
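The answer excerpt above points to Penn Treebank tags as a signal for tense. A minimal sketch of that idea follows, assuming the sentence has already been POS-tagged elsewhere (for example by CoreNLP) and is available as (word, tag) pairs. The function name `guess_tense` and the rule order are illustrative assumptions, not part of the original answer. VBN is deliberately not treated as past on its own, because past participles also appear in perfect and passive constructions.

```python
FUTURE_MODALS = {"will", "shall", "'ll"}


def guess_tense(tagged_tokens):
    """Return a coarse tense label from (word, Penn Treebank tag) pairs.

    Possible results: "future", "past", "present", "unknown".
    """
    words = [word.lower() for word, _ in tagged_tokens]
    tags = [tag for _, tag in tagged_tokens]

    # A future modal (tag MD) followed later by a base-form verb (VB).
    for i, (word, tag) in enumerate(zip(words, tags)):
        if tag == "MD" and word in FUTURE_MODALS and "VB" in tags[i + 1:]:
            return "future"
    # VBD is the simple past form.
    if "VBD" in tags:
        return "past"
    # VBZ / VBP are present-tense finite forms.
    if "VBZ" in tags or "VBP" in tags:
        return "present"
    return "unknown"


if __name__ == "__main__":
    print(guess_tense([("She", "PRP"), ("walked", "VBD"), ("home", "NN")]))
```

This treats the whole sentence as one clause. Sentences with several clauses in different tenses would need the check applied per clause, for example per VP in the parse tree.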

Finding Tense of A sentence using stanford nlp

笑着哭i posted on 2019-12-18 05:52:20
Question: Q1. I am trying to get the tense of a complete sentence, but I don't know how to do it using NLP. Any help appreciated. Q2. What information can be extracted from a sentence using NLP? Currently I can get: 1. Voice of the sentence 2. Subject, object, verb 3. POS tags. If any more information can be extracted, please let me know. Answer 1: The Penn treebank defines VBD and VBN as the past tense and the past participle of a verb, respectively. In many sentences, simply getting the POS tags and checking for the

Stanford Dependency Parser Setup and NLTK

烈酒焚心 posted on 2019-12-18 03:43:29
Question: I got the "standard" Stanford Parser to work thanks to danger89's answers to this previous post, Stanford Parser and NLTK. However, I am now trying to get the dependency parser to work, and it seems the method highlighted in the previous link no longer works. Here is my code: import nltk import os java_path = "C:\\Program Files\\Java\\jre1.8.0_51\\bin\\java.exe" os.environ['JAVAHOME'] = java_path from nltk.parse import stanford os.environ['STANFORD_PARSER'] = 'path/jar' os.environ['STANFORD

nltk StanfordNERTagger : NoClassDefFoundError: org/slf4j/LoggerFactory (In Windows)

為{幸葍}努か posted on 2019-12-18 02:47:47
Question: NOTE: I am using Python 2.7 as part of the Anaconda distribution. I hope this is not a problem for nltk 3.1. I am trying to use nltk for NER as follows: import nltk from nltk.tag.stanford import StanfordNERTagger #st = StanfordNERTagger('stanford-ner/all.3class.distsim.crf.ser.gz', 'stanford-ner/stanford-ner.jar') st = StanfordNERTagger('english.all.3class.distsim.crf.ser.gz') print st.tag(str) but I get: Exception in thread "main" java.lang.NoClassDefFoundError: org/slf4j/LoggerFactory at edu.stanford.nlp