stanford-nlp

StanfordNLP classifier out of memory error

大憨熊 submitted on 2019-12-11 10:48:00
Question: I'm using StanfordNLP to classify some text. It works fine when the training files have at most 160K lines, but with larger files I get a java.lang.OutOfMemoryError: Java heap space. I'm using the following properties:
e.s.n.c.ColumnDataClassifier - Setting ColumnDataClassifier properties
e.s.n.c.ColumnDataClassifier - 1.useAllSplitWordTriples = true
e.s.n.c.ColumnDataClassifier - useQN = true
e.s.n.c.ColumnDataClassifier - encoding = utf-8
e.s.n.c.ColumnDataClassifier

NLP Postagger can't grok imperatives?

假装没事ソ submitted on 2019-12-11 07:13:55
Question: The Stanford NLP POS tagger claims that imperative verbs were added in a recent version. I've fed it lots of text with abundant and obvious imperatives, but there seems to be no tag for them in the output. Must one, after all, train it for this POS?
Answer 1: There is no special tag for imperatives; they are simply tagged as VB. The info on the website refers to the fact that we added a bunch of manually annotated imperative sentences to our training data so that the POS tagger gets more of them right, i.e. tags

Does removing stop words from text affect stanford core nlp NER performance?

时光怂恿深爱的人放手 submitted on 2019-12-11 06:26:26
Question: We are trying to run named entity recognition on millions of comments/feedback, and the process appears to be slow. We are thinking of removing stop words and other frequent words from the texts and then applying NER to them. Does removing stop words affect the accuracy of NER?
Answer 1: I think it's plausible you will get respectable F1 scores if you run on a sentence with the stop words removed. Ultimately you will have to experiment with it and see if the quality is acceptable for your needs.
Source: https:/
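One way to run the experiment the answer suggests is to compare entity-level F1 against a small hand-labelled sample, once on the original text and once on the stop-word-stripped text. The sketch below is only the scoring step, under stated assumptions: the function name entity_f1 and the sample entities are made up for illustration, and entities are compared as (text, type) pairs because token offsets shift once words are removed. Entity names that themselves contain stop words would also need care.

```python
def entity_f1(gold, predicted):
    """Entity-level F1 over sets of (entity_text, entity_type) pairs."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical, hand-made sample (not real CoreNLP output).
gold = {("Myles Brand", "PERSON"), ("N.C.A.A.", "ORGANIZATION")}
with_stop_words = {("Myles Brand", "PERSON"), ("N.C.A.A.", "ORGANIZATION")}
without_stop_words = {("Myles Brand", "PERSON")}

f1_with = entity_f1(gold, with_stop_words)
f1_without = entity_f1(gold, without_stop_words)
print(f1_with, f1_without)
```

If the score on the stripped text stays within an acceptable margin of the score on the original text for a representative sample, the speed-up may be worth it; otherwise it is not.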

what will be CNF form of this probabilistic grammar?

删除回忆录丶 submitted on 2019-12-11 04:52:38
Question: If the PCFG is:
NP -> ADJ N [0.6]
NP -> N [0.4]
N -> cat [0.2]
N -> dog [0.8]
what will the CNF form be? Will it be the following, or something else?
NP -> ADJ NP [0.6]
NP -> cat [0.08]
NP -> dog [0.32]
Answer 1:
NP -> ADJ NP [0.6]
NP -> cat [0.08]
NP -> dog [0.32]
Your answer is correct, because applying the original rules and the converted (CNF) rules must give the same probability for the result.
Source: https://stackoverflow.com/questions/39769119/what-will-be-cnf-form-of-this
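The conversion discussed here comes down to removing the unit rule NP -> N and multiplying probabilities along the way (0.4 × 0.2 and 0.4 × 0.8). The following is a minimal sketch of that single step, not a full CNF converter: the function name eliminate_unit_rules and the dictionary encoding of the grammar are assumptions for illustration, ADJ is assumed to be a preterminal whose lexical rules are not shown, and the grammar is assumed to have no unit-rule cycles.

```python
def eliminate_unit_rules(rules, nonterminals):
    """Remove unit rules A -> B from a PCFG.

    rules maps (lhs, rhs_tuple) to a probability. Each unit rule A -> B [p]
    is replaced by A -> rhs [p * q] for every rule B -> rhs [q].
    Assumes there are no unit-rule cycles (e.g. A -> B and B -> A).
    """
    rules = dict(rules)
    while True:
        unit = next(
            ((lhs, rhs) for (lhs, rhs) in rules
             if len(rhs) == 1 and rhs[0] in nonterminals),
            None,
        )
        if unit is None:
            return rules
        lhs, (child,) = unit
        p = rules.pop(unit)
        for (lhs2, rhs2), q in list(rules.items()):
            if lhs2 == child:
                rules[(lhs, rhs2)] = rules.get((lhs, rhs2), 0.0) + p * q


# The grammar from the question.
pcfg = {
    ("NP", ("ADJ", "N")): 0.6,
    ("NP", ("N",)): 0.4,
    ("N", ("cat",)): 0.2,
    ("N", ("dog",)): 0.8,
}
cnf = eliminate_unit_rules(pcfg, nonterminals={"NP", "N", "ADJ"})
for (lhs, rhs), prob in sorted(cnf.items()):
    print(f"{lhs} -> {' '.join(rhs)} [{prob:.2f}]")
```

Under these assumptions the NP rules keep a total probability of 1, with roughly 0.08 for cat and 0.32 for dog, matching the numbers in the post. Note that this mechanical step leaves the binary rule as NP -> ADJ N and keeps the N -> cat and N -> dog rules, since N is still used on the right-hand side; the ADJ NP written in the post may be a typo.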

StanfordCoreNLP differs from StanfordCoreNLPServer

半城伤御伤魂 submitted on 2019-12-11 04:43:04
Question: If you run:
java -mx3g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -props StanfordCoreNLP-spanish.properties
java -mx3g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLP -props StanfordCoreNLP-spanish.properties
The second command opens a terminal and the Spanish parser works fine, but the Server version uses the English parser instead of the Spanish one.
~/CoreNLP/stanford-corenlp-full-2015-12-09# java -mx3g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -props

PrintTree - No head rule defined for MWE - Bug with version 3.5.2

随声附和 submitted on 2019-12-11 04:09:20
Question: When I try to print the tree of a sentence I've parsed with the RNN parser, it crashes whenever there is an MWE dependency in the sentence. It crashes with the latest version of Stanford NLP (3.5.2), but not with the previous one (3.5.1). Here is the error I get:
java.lang.IllegalArgumentException: No head rule defined for MWE using class edu.stanford.nlp.trees.SemanticHeadFinder in (MWE (VBG according) (TO to))
at edu.stanford.nlp.trees.AbstractCollinsHeadFinder.determineNonTrivialHead

distant supervision: how to connect named entities to freebase (KB) relations

风流意气都作罢 submitted on 2019-12-11 03:59:17
Question: I'm trying to create a distant supervision corpus. So far I've assembled the data and passed it through an NER system; an example is shown below.
Original data:
<p> Myles Brand, the president of the National Collegiate Athletic Association, said in a telephone interview that he had not been approached about whether the N.C.A.A. might oversee a panel for the major bowl games similar to the one that chooses teams for the men's and women's basketball tournaments. </p>
Processed with

initCoreNLP() method call from the Stanford's R coreNLP package throws error

☆樱花仙子☆ submitted on 2019-12-11 03:48:54
Question: I am trying to use the coreNLP package. I ran the following commands and encountered the "GC overhead limit exceeded" error:
library(rJava)
downloadCoreNLP()
initCoreNLP()
The error looks like this:
Loading classifier from edu/stanford/nlp/models/ner/english.conll.4class.distsim.crf.ser.gz ...
Error in rJava::.jnew("edu.stanford.nlp.pipeline.StanfordCoreNLP", basename(path)) : java.lang.OutOfMemoryError: GC overhead limit exceeded
Error during wrapup: cannot open the connection
I don't know much of

Stanford CoreNLP - Exception in thread “main” java.lang.OutOfMemoryError: Java heap space

邮差的信 submitted on 2019-12-11 03:38:32
Question: I am trying to run the simple program available on this website: https://stanfordnlp.github.io/CoreNLP/api.html
My program:
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;
import java.util.List;
import java.util.Properties;
import edu.stanford.nlp.ling.CoreAnnotations.NamedEntityTagAnnotation;
import edu.stanford.nlp.ling.CoreAnnotations

Activate makeCopulaHead in Stanford CoreNLP parser

蹲街弑〆低调 submitted on 2019-12-11 03:24:12
Question: I want to use the Stanford CoreNLP parser to parse a sentence with the "makeCopulaHead" flag activated. My file input.txt contains the following sentence:
I am tall.
The objective is to have no copula relation (cop) in the output dependency tree. I tried:
java -cp "*" -mx8g edu.stanford.nlp.pipeline.StanfordCoreNLP -makeCopulaHead -file input.txt
The .xml file still contains the cop relation :( I also tried (a bug with xml-output: https://mailman.stanford.edu/pipermail/java-nlp-user/2013-January