stanford-nlp

Extracting Arabic Proper Names from a text using Stanford-Parser

Submitted by 梦想的初衷 on 2019-12-12 23:06:05
Question: I am trying to extract Arabic proper names from a text using the Stanford Parser. For example, for the input sentence تكريم سعد الدين الشاذلى, the Arabic Stanford parser produces the tree: (ROOT (NP (NN تكريم) (NP (NNP سعد) (DTNNP الدين) (NNP الشاذلى)))). I want to extract the proper name سعد الدين الشاذلى, which corresponds to the subtree (NP (NNP سعد) (DTNNP الدين) (NNP الشاذلى)). I have tried the approach from a similar question, but something goes wrong in this line: List<TaggedWord>
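
In a real pipeline one would typically walk the `Tree` object or use Stanford's Tregex to match NP nodes dominating only NNP/DTNNP children. The library-free toy below operates instead on the printed bracketed string, which is enough to illustrate the subtree pattern being asked for (it will miss names mixed with non-NNP siblings):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Toy extractor: pulls proper-name spans (NP nodes whose children are
 *  all NNP/DTNNP preterminals) out of a bracketed parse string. */
public class ProperNameSpans {

    // An NP whose children are exclusively (NNP word) / (DTNNP word) leaves
    private static final Pattern NAME_NP = Pattern.compile(
            "\\(NP((?:\\s*\\((?:NNP|DTNNP)\\s+[^()\\s]+\\))+)\\s*\\)");
    private static final Pattern LEAF = Pattern.compile(
            "\\((?:NNP|DTNNP)\\s+([^()\\s]+)\\)");

    public static List<String> extract(String parse) {
        List<String> names = new ArrayList<>();
        Matcher np = NAME_NP.matcher(parse);
        while (np.find()) {
            StringBuilder name = new StringBuilder();
            Matcher leaf = LEAF.matcher(np.group(1));
            while (leaf.find()) {
                if (name.length() > 0) name.append(' ');
                name.append(leaf.group(1));
            }
            names.add(name.toString());
        }
        return names;
    }

    public static void main(String[] args) {
        String tree = "(ROOT (NP (NN تكريم) (NP (NNP سعد) (DTNNP الدين) (NNP الشاذلى))))";
        System.out.println(extract(tree)); // [سعد الدين الشاذلى]
    }
}
```

The outer NP is skipped because its first child (NN تكريم) is not a proper-noun preterminal; only the inner all-NNP/DTNNP subtree matches.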

Extracting Function Tags from Parsed Sentence (using Stanford Parser)

Submitted by 时光怂恿深爱的人放手 on 2019-12-12 19:38:51
Question: The Penn Treebank tagset (http://web.mit.edu/6.863/www/PennTreebankTags.html#RB) includes a section called "Function Tags" that would be extremely helpful for a project I am working on. I know the Stanford Parser uses the Penn Treebank tagset for its EnglishPCFG grammar, so I am hoping there is support for function tags. Using the Stanford Parser and NLTK, I have parsed sentences with clause-, phrase-, and word-level tags as well as Universal Dependencies, but I have not found a way to
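
A caveat worth knowing: Penn Treebank gold trees carry function tags (NP-SBJ, ADVP-TMP, ...), but the stock englishPCFG model generally strips them during training, so parser output will usually not contain them. If you have gold Treebank trees, however, the tags are just hyphenated suffixes on the constituent labels and can be pulled out without any library. A minimal sketch (the example tree is made up):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Toy reader: collects function-tagged labels (the -SBJ / -TMP style
 *  suffixes Penn Treebank attaches to constituents) from a bracketed parse. */
public class FunctionTags {

    // A label like NP-SBJ or ADVP-TMP: category, hyphen, function tag
    private static final Pattern TAGGED = Pattern.compile("\\(([A-Z]+)-([A-Z]+)\\b");

    public static List<String> collect(String parse) {
        List<String> out = new ArrayList<>();
        Matcher m = TAGGED.matcher(parse);
        while (m.find()) out.add(m.group(1) + "-" + m.group(2));
        return out;
    }

    public static void main(String[] args) {
        String gold = "(S (NP-SBJ (NNP Casey)) (VP (VBD threw) (NP (DT the) (NN ball)) (ADVP-TMP (RB yesterday))))";
        System.out.println(collect(gold)); // [NP-SBJ, ADVP-TMP]
    }
}
```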

NER interferes with RegexNER

Submitted by 时光总嘲笑我的痴心妄想 on 2019-12-12 17:08:53
Question: I use regexner to find named entities that are not in the default Stanford NLP set, and it works fine. However, when I add the ner annotator, it annotates tokens that match my regular expressions with the default tags. How can I overwrite the default annotations?

def createNLPPipelineRegex(): StanfordCoreNLP = {
  val props = new Properties()
  props.put("regexner.mapping", "regex.txt")
  props.put("annotators", "tokenize, ssplit, regexner, pos, lemma, ner")
  props.put("tokenize.options", "untokenizable
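
One commonly suggested workaround (worth verifying against the CoreNLP version in use) is to run regexner after ner, since annotators are applied in list order and later labels win:

```properties
# Hypothetical fix: regexner listed after ner so its labels overwrite
annotators = tokenize, ssplit, pos, lemma, ner, regexner
regexner.mapping = regex.txt
```

In the mapping file, a tab-separated third column (e.g. `ORGANIZATION,LOCATION,PERSON`) names the existing NER types a rule is allowed to overwrite; without it, RegexNERAnnotator typically leaves tokens already tagged by the statistical NER alone.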

Importing Stanford NLP in IntelliJ

Submitted by 久未见 on 2019-12-12 16:13:44
Question: I'm having trouble using the Stanford lemmatizer. Since I'm using the IntelliJ IDE, I tried to import it via the Dependencies window, but I can't access all the classes that way. Is there a way to import stanford-english-corenlp-models-current.jar and stanford-corenlp-models-current.jar correctly in IntelliJ? Answer 1: As mentioned above, you just imported the wrong file. First, download CoreNLP 3.7.0 (beta). In the screenshot above, click the red button to download the file, which covers all the things to
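
As an alternative to dragging jars into the IDE, the usual route is to let Maven (or Gradle) resolve CoreNLP plus its English models; IntelliJ then picks both up automatically on reimport. A sketch using the 3.7.0 release the answer mentions (adjust the version to what is actually in use):

```xml
<!-- CoreNLP code jar -->
<dependency>
  <groupId>edu.stanford.nlp</groupId>
  <artifactId>stanford-corenlp</artifactId>
  <version>3.7.0</version>
</dependency>
<!-- English models jar, published under the "models" classifier -->
<dependency>
  <groupId>edu.stanford.nlp</groupId>
  <artifactId>stanford-corenlp</artifactId>
  <version>3.7.0</version>
  <classifier>models</classifier>
</dependency>
```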

Manual tagging of words using Stanford CoreNLP

Submitted by 妖精的绣舞 on 2019-12-12 10:23:31
Question: I have a resource where I know exactly the types of the words. I have to lemmatize them, but for correct results I have to tag them manually. I could not find any code for manual tagging of words. I am using the following code, but it returns the wrong result, i.e. "painting" for "painting" where I expect "paint".

// ...........lemmatization starts........................
Properties props = new Properties();
props.put("annotators", "tokenize, ssplit, pos, lemma");
StanfordCoreNLP pipeline = new
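
CoreNLP's lemma annotator reads whatever POS tag is already attached to each token, so one route (an assumption to verify against the version in use) is to set the tags on the tokens yourself before the lemma step runs, or to call `edu.stanford.nlp.process.Morphology`'s `lemma(word, tag)` directly with your known tags. The toy below is not the Stanford implementation; it only illustrates why supplying the tag changes the result:

```java
/** Toy illustration of tag-conditioned lemmatization: the same surface
 *  form maps to different lemmas depending on the POS tag supplied. */
public class TagAwareLemma {

    public static String lemma(String word, String tag) {
        // Nouns (NN) are left alone; gerunds/participles (VBG) drop "ing"
        if (tag.equals("VBG") && word.endsWith("ing")) {
            return word.substring(0, word.length() - 3);
        }
        return word;
    }

    public static void main(String[] args) {
        System.out.println(lemma("painting", "NN"));  // painting (the noun)
        System.out.println(lemma("painting", "VBG")); // paint (the verb)
    }
}
```

This is exactly the asker's case: with an NN tag "painting" stays a noun, while a manually supplied VBG tag yields the verb lemma "paint".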

Phrase-level dependency parser using Java NLP

Submitted by ℡╲_俬逩灬. on 2019-12-12 10:22:51
Question: Can someone please elaborate on how to obtain "phrase-level dependencies" using Stanford's Natural Language Processing lexical parser (open-source Java code)? http://svn.apache.org/repos/asf/nutch/branches/branch-1.2/src/plugin/lib-http/src/java/org/apache/nutch/protocol/http/api/RobotRulesParser.java http://docs.mongodb.org/manual/reference/sql-comparison/ For example, phrase dependencies such as:
The accident ---------> happened
falling ---------> as
the night ----------> falling
and many more...
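
If "phrase-level dependency" means head-to-dependent arcs like those above, the lexicalized parser can derive typed dependencies from a constituency parse. A sketch along the lines of the shipped ParserDemo (the model path and sentence are illustrative, and the CoreNLP jars plus the English PCFG model are assumed to be on the classpath):

```java
import java.io.StringReader;

import edu.stanford.nlp.ling.CoreLabel;
import edu.stanford.nlp.parser.lexparser.LexicalizedParser;
import edu.stanford.nlp.process.CoreLabelTokenFactory;
import edu.stanford.nlp.process.PTBTokenizer;
import edu.stanford.nlp.process.Tokenizer;
import edu.stanford.nlp.trees.*;

public class PhraseDependencyDemo {
    public static void main(String[] args) {
        LexicalizedParser lp = LexicalizedParser.loadModel(
                "edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz");
        Tokenizer<CoreLabel> tok = PTBTokenizer.factory(new CoreLabelTokenFactory(), "")
                .getTokenizer(new StringReader("The accident happened as the night was falling."));
        Tree parse = lp.apply(tok.tokenize());

        // Convert the constituency tree into typed (labelled) dependencies
        GrammaticalStructureFactory gsf = lp.treebankLanguagePack().grammaticalStructureFactory();
        GrammaticalStructure gs = gsf.newGrammaticalStructure(parse);
        for (TypedDependency td : gs.typedDependenciesCCprocessed()) {
            System.out.println(td); // e.g. nsubj(happened-3, accident-2)
        }
    }
}
```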

Stanford CoreNLP: how to get the probability and margin of error

Submitted by 夙愿已清 on 2019-12-12 08:36:38
Question: When using the parser, or for that matter any of the annotators in CoreNLP, is there a way to access the probability or the margin of error? To put my question into context, I am trying to understand whether there is a programmatic way to detect a case of ambiguity. For instance, in the sentence below the verb "desire" is detected as a noun. I would like to know what kind of measure I can access or calculate from the CoreNLP API to tell me there could be an ambiguity. (NP (NP (NNP
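
There is no single confidence score attached to annotations, but the lexicalized parser can expose the scores of its k best parses, and comparing the top scores gives a rough, home-made ambiguity signal. A sketch (assumes the CoreNLP jars and English PCFG model are available; the returned values are log-scores, so two nearly equal top scores suggest the parser found the sentence genuinely ambiguous):

```java
import java.io.StringReader;
import java.util.List;

import edu.stanford.nlp.ling.CoreLabel;
import edu.stanford.nlp.parser.lexparser.LexicalizedParser;
import edu.stanford.nlp.parser.lexparser.LexicalizedParserQuery;
import edu.stanford.nlp.process.CoreLabelTokenFactory;
import edu.stanford.nlp.process.PTBTokenizer;
import edu.stanford.nlp.trees.Tree;
import edu.stanford.nlp.util.ScoredObject;

public class KBestDemo {
    public static void main(String[] args) {
        LexicalizedParser lp = LexicalizedParser.loadModel(
                "edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz");
        LexicalizedParserQuery lpq = lp.lexicalizedParserQuery();
        List<CoreLabel> words = PTBTokenizer.factory(new CoreLabelTokenFactory(), "")
                .getTokenizer(new StringReader("I desire to learn .")).tokenize();
        lpq.parse(words);
        // Log-scores of the 5 best parses; a small gap between the top two
        // hints at ambiguity (trees themselves are in the ScoredObject too)
        for (ScoredObject<Tree> so : lpq.getKBestPCFGParses(5)) {
            System.out.println(so.score());
        }
    }
}
```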

How to use Stanford parser

Submitted by 一个人想着一个人 on 2019-12-12 07:57:08
Question: I downloaded the Stanford Parser 2.0.5 and used the Demo2.java source code that comes in the package, but after I compile and run the program it reports many errors. A part of my program is:

public class testStanfordParser {
    /** Usage: ParserDemo2 [[grammar] textFile] */
    public static void main(String[] args) throws IOException {
        String grammar = args.length > 0 ? args[0] : "edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz";
        String[] options = { "-maxLength", "80", "-retainTmpSubcategories" };
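
A minimal, self-contained completion of that skeleton (same grammar path and options; this is a sketch, not the original Demo2.java, and assumes the parser jar and model are on the classpath) would look roughly like:

```java
import java.io.IOException;
import java.util.List;

import edu.stanford.nlp.ling.HasWord;
import edu.stanford.nlp.parser.lexparser.LexicalizedParser;
import edu.stanford.nlp.process.DocumentPreprocessor;
import edu.stanford.nlp.trees.Tree;

public class testStanfordParser {
    /** Usage: testStanfordParser [[grammar] textFile] */
    public static void main(String[] args) throws IOException {
        String grammar = args.length > 0 ? args[0]
                : "edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz";
        String[] options = { "-maxLength", "80", "-retainTmpSubcategories" };
        LexicalizedParser lp = LexicalizedParser.loadModel(grammar, options);
        if (args.length > 1) {
            // DocumentPreprocessor splits the file into tokenized sentences
            for (List<HasWord> sentence : new DocumentPreprocessor(args[1])) {
                Tree parse = lp.apply(sentence);
                parse.pennPrint();
            }
        }
    }
}
```

Most "many errors" reports with the demos come down to a missing jar on the classpath or a model path that does not match the downloaded release, so those are worth checking first.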

How to generate custom training data for Stanford relation extraction

Submitted by 家住魔仙堡 on 2019-12-12 06:51:01
Question: I have trained a custom classifier to recognize named entities in the finance domain. I want to generate custom training data like that shown in the link below: http://cogcomp.cs.illinois.edu/Data/ER/conll04.corp I can mark the custom relations by hand, but first I want to generate the data in a CoNLL-like format with my custom named entities. I have also tried the parser in the following way, but that does not generate relation training data like Roth and Yih's data mentioned in the link: https://nlp.stanford.edu
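
For the first step, a CoNLL-like layout with custom entities, CoreNLP's CRF tools read a simple tab-separated token/label file. The fragment below is illustrative only (the finance labels are made up for this sketch), and the relation columns of Roth and Yih's format would still have to be added on top of it, typically by a script of your own:

```text
Acme	ORGANIZATION
Corp	ORGANIZATION
issued	O
corporate	FINANCIAL_INSTRUMENT
bonds	FINANCIAL_INSTRUMENT
.	O
```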

German CoreNLP model defaulting to English models

Submitted by 让人想犯罪 __ on 2019-12-12 04:34:14
Question: I use the following command to start a CoreNLP server with the German language models, which are downloaded as a jar on the classpath, but it does not output German tags or parses and loads only the English models: java -mx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -props ./german.prop
Contents of german.prop:
annotators = tokenize, ssplit, pos, depparse, parse
tokenize.language = de
pos.model = edu/stanford/nlp/models/pos-tagger/german/german-hgc.tagger
ner.model = edu/stanford/nlp/models
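
Two things are worth checking here, both hedged since behaviour differs across CoreNLP releases: the server may ignore `-props` at startup, expecting either `-serverProperties` or a per-request `properties` URL parameter instead, and every annotator in the list needs its German model spelled out or it silently falls back to the English default. A sketch of a fuller german.prop (model paths are the ones commonly shipped in the German models jar of that era; verify them against the jar actually on the classpath):

```properties
annotators = tokenize, ssplit, pos, depparse, parse
tokenize.language = de
pos.model = edu/stanford/nlp/models/pos-tagger/german/german-hgc.tagger
parse.model = edu/stanford/nlp/models/lexparser/germanFactored.ser.gz
depparse.model = edu/stanford/nlp/models/parser/nndep/UD_German.gz
# Only needed if ner is added to the annotators list
ner.model = edu/stanford/nlp/models/ner/german.conll.hgc_175m_600.crf.ser.gz
ner.applyNumericClassifiers = false
ner.useSUTime = false
```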