stanford-nlp

Extracting Arabic Proper Names from a text using Stanford-Parser

Submitted by 梦想的初衷 on 2019-12-12 23:06:05
Question: I am trying to extract Arabic proper names from a text using the Stanford Parser. For example, for the input sentence تكريم سعد الدين الشاذلى, the Arabic Stanford parser produces the tree: (ROOT (NP (NN تكريم) (NP (NNP سعد) (DTNNP الدين) (NNP الشاذلى)))). I want to extract the proper name سعد الدين الشاذلى, which corresponds to the subtree (NP (NNP سعد) (DTNNP الدين) (NNP الشاذلى)). I have tried the approach from a similar question, but something goes wrong in this line: List<TaggedWord>
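
In a real pipeline one would typically walk the `Tree` object or use Stanford's Tregex to match NP nodes dominating only NNP/DTNNP children. The library-free toy below operates instead on the printed bracketed string, which is enough to illustrate the subtree pattern being asked for (it will miss names mixed with non-NNP siblings):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Toy extractor: pulls proper-name spans (NP nodes whose children are
 *  all NNP/DTNNP preterminals) out of a bracketed parse string. */
public class ProperNameSpans {

    // An NP whose children are exclusively (NNP word) / (DTNNP word) leaves
    private static final Pattern NAME_NP = Pattern.compile(
            "\\(NP((?:\\s*\\((?:NNP|DTNNP)\\s+[^()\\s]+\\))+)\\s*\\)");
    private static final Pattern LEAF = Pattern.compile(
            "\\((?:NNP|DTNNP)\\s+([^()\\s]+)\\)");

    public static List<String> extract(String parse) {
        List<String> names = new ArrayList<>();
        Matcher np = NAME_NP.matcher(parse);
        while (np.find()) {
            StringBuilder name = new StringBuilder();
            Matcher leaf = LEAF.matcher(np.group(1));
            while (leaf.find()) {
                if (name.length() > 0) name.append(' ');
                name.append(leaf.group(1));
            }
            names.add(name.toString());
        }
        return names;
    }

    public static void main(String[] args) {
        String tree = "(ROOT (NP (NN تكريم) (NP (NNP سعد) (DTNNP الدين) (NNP الشاذلى))))";
        System.out.println(extract(tree)); // [سعد الدين الشاذلى]
    }
}
```

The outer NP is skipped because its first child (NN تكريم) is not a proper-noun preterminal; only the inner all-NNP/DTNNP subtree matches.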

Extracting Function Tags from Parsed Sentence (using Stanford Parser)

Submitted by 时光怂恿深爱的人放手 on 2019-12-12 19:38:51
Question: The Penn Treebank tagset (http://web.mit.edu/6.863/www/PennTreebankTags.html#RB) includes a section called "Function Tags" that would be extremely helpful for a project I am working on. I know the Stanford Parser uses the Penn Treebank tagset for its EnglishPCFG grammar, so I am hoping there is support for function tags. Using the Stanford Parser and NLTK, I have parsed sentences with clause-, phrase-, and word-level tags as well as Universal Dependencies, but I have not found a way to
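
A caveat worth knowing: Penn Treebank gold trees carry function tags (NP-SBJ, ADVP-TMP, ...), but the stock englishPCFG model generally strips them during training, so parser output will usually not contain them. If you have gold Treebank trees, however, the tags are just hyphenated suffixes on the constituent labels and can be pulled out without any library. A minimal sketch (the example tree is made up):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Toy reader: collects function-tagged labels (the -SBJ / -TMP style
 *  suffixes Penn Treebank attaches to constituents) from a bracketed parse. */
public class FunctionTags {

    // A label like NP-SBJ or ADVP-TMP: category, hyphen, function tag
    private static final Pattern TAGGED = Pattern.compile("\\(([A-Z]+)-([A-Z]+)\\b");

    public static List<String> collect(String parse) {
        List<String> out = new ArrayList<>();
        Matcher m = TAGGED.matcher(parse);
        while (m.find()) out.add(m.group(1) + "-" + m.group(2));
        return out;
    }

    public static void main(String[] args) {
        String gold = "(S (NP-SBJ (NNP Casey)) (VP (VBD threw) (NP (DT the) (NN ball)) (ADVP-TMP (RB yesterday))))";
        System.out.println(collect(gold)); // [NP-SBJ, ADVP-TMP]
    }
}
```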

NER interferes with RegexNER

Submitted by 时光总嘲笑我的痴心妄想 on 2019-12-12 17:08:53
Question: I use regexner to find named entities that are not in the default Stanford NLP set, and it works fine. However, when I add the ner annotator, it annotates tokens that match my regular expressions with the default tags. How can I overwrite the default annotations?

def createNLPPipelineRegex(): StanfordCoreNLP = {
  val props = new Properties()
  props.put("regexner.mapping", "regex.txt")
  props.put("annotators", "tokenize, ssplit, regexner, pos, lemma, ner")
  props.put("tokenize.options", "untokenizable
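
One commonly suggested workaround (worth verifying against the CoreNLP version in use) is to run regexner after ner, since annotators are applied in list order and later labels win:

```properties
# Hypothetical fix: regexner listed after ner so its labels overwrite
annotators = tokenize, ssplit, pos, lemma, ner, regexner
regexner.mapping = regex.txt
```

In the mapping file, a tab-separated third column (e.g. `ORGANIZATION,LOCATION,PERSON`) names the existing NER types a rule is allowed to overwrite; without it, RegexNERAnnotator typically leaves tokens already tagged by the statistical NER alone.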

Importing Stanford NLP in IntelliJ

Submitted by 久未见 on 2019-12-12 16:13:44
Question: I'm having trouble using the Stanford lemmatizer. Since I'm using the IntelliJ IDE, I tried to import it via the Dependencies window, but I can't access all the classes that way. Is there a way to import stanford-english-corenlp-models-current.jar and stanford-corenlp-models-current.jar correctly in IntelliJ? Answer 1: As mentioned above, you just imported the wrong file. First, download CoreNLP 3.7.0 (beta). In the screenshot above, click the red button to download the file, which covers all the things to
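
As an alternative to dragging jars into the IDE, the usual route is to let Maven (or Gradle) resolve CoreNLP plus its English models; IntelliJ then picks both up automatically on reimport. A sketch using the 3.7.0 release the answer mentions (adjust the version to what is actually in use):

```xml
<!-- CoreNLP code jar -->
<dependency>
  <groupId>edu.stanford.nlp</groupId>
  <artifactId>stanford-corenlp</artifactId>
  <version>3.7.0</version>
</dependency>
<!-- English models jar, published under the "models" classifier -->
<dependency>
  <groupId>edu.stanford.nlp</groupId>
  <artifactId>stanford-corenlp</artifactId>
  <version>3.7.0</version>
  <classifier>models</classifier>
</dependency>
```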

Manual tagging of words using Stanford CoreNLP

Submitted by 妖精的绣舞 on 2019-12-12 10:23:31
Question: I have a resource where I know exactly the types of the words. I have to lemmatize them, but for correct results I have to tag them manually. I could not find any code for manual tagging of words. I am using the following code, but it returns the wrong result, i.e. "painting" for "painting" where I expect "paint".

// ...........lemmatization starts........................
Properties props = new Properties();
props.put("annotators", "tokenize, ssplit, pos, lemma");
StanfordCoreNLP pipeline = new
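
CoreNLP's lemma annotator reads whatever POS tag is already attached to each token, so one route (an assumption to verify against the version in use) is to set the tags on the tokens yourself before the lemma step runs, or to call `edu.stanford.nlp.process.Morphology`'s `lemma(word, tag)` directly with your known tags. The toy below is not the Stanford implementation; it only illustrates why supplying the tag changes the result:

```java
/** Toy illustration of tag-conditioned lemmatization: the same surface
 *  form maps to different lemmas depending on the POS tag supplied. */
public class TagAwareLemma {

    public static String lemma(String word, String tag) {
        // Nouns (NN) are left alone; gerunds/participles (VBG) drop "ing"
        if (tag.equals("VBG") && word.endsWith("ing")) {
            return word.substring(0, word.length() - 3);
        }
        return word;
    }

    public static void main(String[] args) {
        System.out.println(lemma("painting", "NN"));  // painting (the noun)
        System.out.println(lemma("painting", "VBG")); // paint (the verb)
    }
}
```

This is exactly the asker's case: with an NN tag "painting" stays a noun, while a manually supplied VBG tag yields the verb lemma "paint".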

Phrase-level dependency parser using Java NLP

Submitted by ℡╲_俬逩灬. on 2019-12-12 10:22:51
Question: Can someone please elaborate on how to obtain "phrase-level dependencies" using Stanford's Natural Language Processing lexical parser (open-source Java code)? http://svn.apache.org/repos/asf/nutch/branches/branch-1.2/src/plugin/lib-http/src/java/org/apache/nutch/protocol/http/api/RobotRulesParser.java http://docs.mongodb.org/manual/reference/sql-comparison/ For example, phrase dependencies such as:
The accident ---------> happened
falling ---------> as
the night ----------> falling
and many more...
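
If "phrase-level dependency" means head-to-dependent arcs like those above, the lexicalized parser can derive typed dependencies from a constituency parse. A sketch along the lines of the shipped ParserDemo (the model path and sentence are illustrative, and the CoreNLP jars plus the English PCFG model are assumed to be on the classpath):

```java
import java.io.StringReader;

import edu.stanford.nlp.ling.CoreLabel;
import edu.stanford.nlp.parser.lexparser.LexicalizedParser;
import edu.stanford.nlp.process.CoreLabelTokenFactory;
import edu.stanford.nlp.process.PTBTokenizer;
import edu.stanford.nlp.process.Tokenizer;
import edu.stanford.nlp.trees.*;

public class PhraseDependencyDemo {
    public static void main(String[] args) {
        LexicalizedParser lp = LexicalizedParser.loadModel(
                "edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz");
        Tokenizer<CoreLabel> tok = PTBTokenizer.factory(new CoreLabelTokenFactory(), "")
                .getTokenizer(new StringReader("The accident happened as the night was falling."));
        Tree parse = lp.apply(tok.tokenize());

        // Convert the constituency tree into typed (labelled) dependencies
        GrammaticalStructureFactory gsf = lp.treebankLanguagePack().grammaticalStructureFactory();
        GrammaticalStructure gs = gsf.newGrammaticalStructure(parse);
        for (TypedDependency td : gs.typedDependenciesCCprocessed()) {
            System.out.println(td); // e.g. nsubj(happened-3, accident-2)
        }
    }
}
```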

Stanford CoreNLP: how to get the probability and margin of error

Submitted by 夙愿已清 on 2019-12-12 08:36:38
Question: When using the parser, or for that matter any of the annotators in CoreNLP, is there a way to access the probability or the margin of error? To put my question into context, I am trying to understand whether there is a programmatic way to detect a case of ambiguity. For instance, in the sentence below the verb "desire" is detected as a noun. I would like to know what kind of measure I can access or calculate from the CoreNLP API to tell me there could be an ambiguity. (NP (NP (NNP
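
There is no single confidence score attached to annotations, but the lexicalized parser can expose the scores of its k best parses, and comparing the top scores gives a rough, home-made ambiguity signal. A sketch (assumes the CoreNLP jars and English PCFG model are available; the returned values are log-scores, so two nearly equal top scores suggest the parser found the sentence genuinely ambiguous):

```java
import java.io.StringReader;
import java.util.List;

import edu.stanford.nlp.ling.CoreLabel;
import edu.stanford.nlp.parser.lexparser.LexicalizedParser;
import edu.stanford.nlp.parser.lexparser.LexicalizedParserQuery;
import edu.stanford.nlp.process.CoreLabelTokenFactory;
import edu.stanford.nlp.process.PTBTokenizer;
import edu.stanford.nlp.trees.Tree;
import edu.stanford.nlp.util.ScoredObject;

public class KBestDemo {
    public static void main(String[] args) {
        LexicalizedParser lp = LexicalizedParser.loadModel(
                "edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz");
        LexicalizedParserQuery lpq = lp.lexicalizedParserQuery();
        List<CoreLabel> words = PTBTokenizer.factory(new CoreLabelTokenFactory(), "")
                .getTokenizer(new StringReader("I desire to learn .")).tokenize();
        lpq.parse(words);
        // Log-scores of the 5 best parses; a small gap between the top two
        // hints at ambiguity (trees themselves are in the ScoredObject too)
        for (ScoredObject<Tree> so : lpq.getKBestPCFGParses(5)) {
            System.out.println(so.score());
        }
    }
}
```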

How to use Stanford parser

Submitted by 一个人想着一个人 on 2019-12-12 07:57:08
Question: I downloaded the Stanford Parser 2.0.5 and used the Demo2.java source code that comes in the package, but after I compile and run the program it reports many errors. A part of my program is:

public class testStanfordParser {
    /** Usage: ParserDemo2 [[grammar] textFile] */
    public static void main(String[] args) throws IOException {
        String grammar = args.length > 0 ? args[0] : "edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz";
        String[] options = { "-maxLength", "80", "-retainTmpSubcategories" };
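
A minimal, self-contained completion of that skeleton (same grammar path and options; this is a sketch, not the original Demo2.java, and assumes the parser jar and model are on the classpath) would look roughly like:

```java
import java.io.IOException;
import java.util.List;

import edu.stanford.nlp.ling.HasWord;
import edu.stanford.nlp.parser.lexparser.LexicalizedParser;
import edu.stanford.nlp.process.DocumentPreprocessor;
import edu.stanford.nlp.trees.Tree;

public class testStanfordParser {
    /** Usage: testStanfordParser [[grammar] textFile] */
    public static void main(String[] args) throws IOException {
        String grammar = args.length > 0 ? args[0]
                : "edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz";
        String[] options = { "-maxLength", "80", "-retainTmpSubcategories" };
        LexicalizedParser lp = LexicalizedParser.loadModel(grammar, options);
        if (args.length > 1) {
            // DocumentPreprocessor splits the file into tokenized sentences
            for (List<HasWord> sentence : new DocumentPreprocessor(args[1])) {
                Tree parse = lp.apply(sentence);
                parse.pennPrint();
            }
        }
    }
}
```

Most "many errors" reports with the demos come down to a missing jar on the classpath or a model path that does not match the downloaded release, so those are worth checking first.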

How to generate custom training data for Stanford relation extraction

Submitted by 家住魔仙堡 on 2019-12-12 06:51:01
Question: I have trained a custom classifier to recognize named entities in the finance domain. I want to generate custom training data like that shown in the link below: http://cogcomp.cs.illinois.edu/Data/ER/conll04.corp I can mark the custom relations by hand, but first I want to generate the data in a CoNLL-like format with my custom named entities. I have also tried the parser in the following way, but that does not generate relation training data like Roth and Yih's data mentioned in the link: https://nlp.stanford.edu
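
For the first step, a CoNLL-like layout with custom entities, CoreNLP's CRF tools read a simple tab-separated token/label file. The fragment below is illustrative only (the finance labels are made up for this sketch), and the relation columns of Roth and Yih's format would still have to be added on top of it, typically by a script of your own:

```text
Acme	ORGANIZATION
Corp	ORGANIZATION
issued	O
corporate	FINANCIAL_INSTRUMENT
bonds	FINANCIAL_INSTRUMENT
.	O
```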

German CoreNLP model defaulting to English models

Submitted by 让人想犯罪 __ on 2019-12-12 04:34:14
Question: I use the following command to start a CoreNLP server with the German language models, which are downloaded as a jar on the classpath, but it does not output German tags or parses and loads only the English models: java -mx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -props ./german.prop
Contents of german.prop:
annotators = tokenize, ssplit, pos, depparse, parse
tokenize.language = de
pos.model = edu/stanford/nlp/models/pos-tagger/german/german-hgc.tagger
ner.model = edu/stanford/nlp/models
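
Two things are worth checking here, both hedged since behaviour differs across CoreNLP releases: the server may ignore `-props` at startup, expecting either `-serverProperties` or a per-request `properties` URL parameter instead, and every annotator in the list needs its German model spelled out or it silently falls back to the English default. A sketch of a fuller german.prop (model paths are the ones commonly shipped in the German models jar of that era; verify them against the jar actually on the classpath):

```properties
annotators = tokenize, ssplit, pos, depparse, parse
tokenize.language = de
pos.model = edu/stanford/nlp/models/pos-tagger/german/german-hgc.tagger
parse.model = edu/stanford/nlp/models/lexparser/germanFactored.ser.gz
depparse.model = edu/stanford/nlp/models/parser/nndep/UD_German.gz
# Only needed if ner is added to the annotators list
ner.model = edu/stanford/nlp/models/ner/german.conll.hgc_175m_600.crf.ser.gz
ner.applyNumericClassifiers = false
ner.useSUTime = false
```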