stanford-nlp | 易学教程

Triple extraction from a sentence

I have this parsed text in the following format, which I got by using Stanford NLP:

(ROOT (S (NP (DT A) (NN passenger) (NN plane)) (VP (VBZ has) (VP (VBD crashed) (ADVP (RB shortly)) (PP (IN after) (NP (NP (NN take-off)) (PP (IN from) (NP (NNP Kyrgyzstan) (`` `) (NNP scapital) (, ,) (NNP Bishkek))))) (, ,) (VP (VBG killing) (NP (NP (DT a) (JJ large) (NN number)) (PP (IN of) (NP (NP (DT those)) (PP (IN on) (NP (NN board))))))))) (. .)))

det(plane-3, A-1)
nn(plane-3, passenger-2)
nsubj(crashed-5, plane-3)
aux(crashed-5, has-4)
root(ROOT-0, crashed-5)
advmod(crashed-5, shortly-6)
prep_after(crashed-5, take
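A rough sketch of one way to pull triples out of dependency output in this `relation(governor-i, dependent-j)` form: parse each item with a regular expression, then pair every `nsubj` with a `dobj` or collapsed `prep_*` relation on the same governor. The function names and the pairing rule are illustrative assumptions, not part of Stanford NLP, and the sample input is a short, complete dependency list rather than the truncated one in the question.

```python
import re

# Matches one typed dependency such as "nsubj(crashed-5, plane-3)".
DEP_PATTERN = re.compile(r"([\w:]+)\(([^,\s]+)-(\d+), ([^,\s]+)-(\d+)\)")


def parse_dependencies(text):
    """Return (relation, (governor, index), (dependent, index)) tuples."""
    return [
        (rel, (gov, int(gov_idx)), (dep, int(dep_idx)))
        for rel, gov, gov_idx, dep, dep_idx in DEP_PATTERN.findall(text)
    ]


def extract_triples(deps):
    """Pair each subject with objects attached to the same governor.

    Assumed heuristic: dobj gives (subject, verb, object);
    prep_X gives (subject, "verb X", object).
    """
    subjects = {gov: dep for rel, gov, dep in deps if rel == "nsubj"}
    triples = []
    for rel, gov, dep in deps:
        if gov not in subjects:
            continue
        subject = subjects[gov][0]
        if rel == "dobj":
            triples.append((subject, gov[0], dep[0]))
        elif rel.startswith("prep_"):
            predicate = gov[0] + " " + rel[len("prep_"):]
            triples.append((subject, predicate, dep[0]))
    return triples


# Hypothetical sample input in the same notation.
sample = "nsubj(shot-2, I-1) det(elephant-4, an-3) dobj(shot-2, elephant-4)"
triples = extract_triples(parse_dependencies(sample))
print(triples)
```

This only covers the simplest active-voice pattern; passives, conjunctions, and multi-word noun phrases (for example joining `nn` modifiers onto the head noun) would need extra rules.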

Resolve coreference using Stanford CoreNLP - unable to load parser model

Question: I want to do a very simple job: given a string containing pronouns, I want to resolve them. For example, I want to turn the sentence "Mary has a little lamb. She is cute." into "Mary has a little lamb. Mary is cute." I have tried to use Stanford CoreNLP, but I seem unable to get the parser to start. I have imported all the included jars into my project using Eclipse, and I have allocated 3 GB to the JVM (-Xmx3g). The error is very awkward: Exception in thread "main" java.lang

How to recognize a lowercase named entity such as kobe bryant with CoreNLP?

My problem is that CoreNLP only recognizes a named entity such as Kobe Bryant when it begins with an uppercase character; it can't recognize kobe bryant as a person. How can I get CoreNLP to recognize a named entity that begins with a lowercase character? Any help is appreciated.

First off, you do have to accept that it is harder to get named entities right in lowercase or inconsistently cased English text than in formal text, where capital letters are a great clue. (This is also one reason why Chinese NER is harder than English NER.) Nevertheless, there are things that you must do to get

OpenNLP vs Stanford CoreNLP

Question: I've been doing a little comparison of these two packages and am not sure which direction to go in. Briefly, what I am looking for is: Named Entity Recognition (people, places, organizations and such). Gender identification. A decent training API. From what I can tell, OpenNLP and Stanford CoreNLP expose pretty similar capabilities. However, Stanford CoreNLP looks like it has a lot more activity, whereas OpenNLP has only had a few commits in the last six months. Based on what I saw, OpenNLP

How to create an incremental NER training model (appending to an existing model)?

I am training a customized Named Entity Recognition (NER) model using Stanford NLP, but I want to be able to re-train the model. Example: suppose I have trained an xyz model and then test it on some text. If the model detects something wrongly, I (the end user) will correct it and want to re-train the model (in append mode) on the corrected text. Stanford doesn't provide a re-training facility, which is why I shifted to the spaCy library for Python, where I can re-train the model, meaning I can append new entities to the existing model. But after re-training the model using spaCy, it is overriding the existing

Issue using Stanford CoreNLP parsing models

I cannot find the Stanford parsing models for German and French: there is no "germanPCFG.ser.gz" or "frenchFactored.ser.gz" in the jar (stanford-corenlp-3.2.0-models.jar), only English. I have searched through the POS tagger jar too. The same issue was encountered at: How to use Stanford CoreNLP with a Non-English parse model?

You can find them in the download for the Stanford Parser. Look in the models.jar file. With Maven you can use

<dependency>
  <groupId>edu.stanford.nlp</groupId>
  <artifactId>stanford-corenlp</artifactId>
  <version>3.5.2</version>
</dependency>
<dependency>
  <groupId>edu.stanford.nlp<

How to make a tree from the output of a dependency parser?

I am trying to make a tree (nested dictionary) from the output of a dependency parser. The sentence is "I shot an elephant in my sleep". I am able to get the output as described in the link: How do I do dependency parsing in NLTK?

nsubj(shot-2, I-1)
det(elephant-4, an-3)
dobj(shot-2, elephant-4)
prep(shot-2, in-5)
poss(sleep-7, my-6)
pobj(in-5, sleep-7)

To convert this list of tuples into a nested dictionary, I used the following link: How to convert python list of tuples into tree?

def build_tree(list_of_tuples):
    all_nodes = {n[2]:((n[0], n[1]),{}) for n in list_of_tuples}
    root = {}
    print all
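For comparison, here is a minimal sketch of one way to nest such tuples, assuming each dependency is given as `(relation, head, dependent)` and the graph is acyclic. The dictionary layout (keys are `(relation, word)` pairs, values are the child subtrees) and the `root_relation` parameter are choices made for this illustration, not the layout used in the linked answer.

```python
def build_tree(dependencies, root_relation="root"):
    """Nest (relation, head, dependent) tuples into a dictionary tree."""
    children = {}
    dependents = set()
    for relation, head, dependent in dependencies:
        children.setdefault(head, []).append((relation, dependent))
        dependents.add(dependent)

    def subtree(node):
        # Recurse into every child of this node; leaves map to {}.
        return {
            (relation, child): subtree(child)
            for relation, child in children.get(node, [])
        }

    # A root is a head that never appears as a dependent.
    roots = [head for head in children if head not in dependents]
    return {(root_relation, root): subtree(root) for root in roots}


dependencies = [
    ("nsubj", "shot-2", "I-1"),
    ("det", "elephant-4", "an-3"),
    ("dobj", "shot-2", "elephant-4"),
    ("prep", "shot-2", "in-5"),
    ("poss", "sleep-7", "my-6"),
    ("pobj", "in-5", "sleep-7"),
]
tree = build_tree(dependencies)
print(tree)
```

Keeping the `-N` index on each word avoids collisions when the same word occurs twice in a sentence.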

Neural Network Stanford parser word2vector format error during training

I am trying to train a model with the Stanford neural network dependency parser for English. It does not accept a standard word2vec file with 100 dimensions and generates an error message instead. I am using the word embeddings available from this web page: https://drive.google.com/file/d/0B8nESzOdPhLsdWF2S1Ayb1RkTXc/view?usp=sharing I have downloaded the data as a text file on my PC. I am using the parameter -embeddingSize 100, but the parser generates an error message:

Embedding File /../.../sskip.100.vectors: #Words=243004, dim=1
The dimension of embedding file does not match config.embeddingSize

I
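A quick way to narrow this down is to check what dimension each line of the text file actually implies. The sketch below is a standalone diagnostic, not part of the parser. It assumes whitespace-separated fields with the word first, and it compares the first line against the rest, because one plausible cause of `dim=1` (an assumption, not confirmed by the excerpt) is a word2vec-style header line such as `243004 100` being read as a vector.

```python
from collections import Counter
import io


def inspect_embeddings(lines):
    """Return (dimension implied by the first line, Counter of all dimensions).

    The dimension of a line is its number of whitespace-separated
    fields minus one (the leading word).
    """
    first_dim = None
    counts = Counter()
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        dim = len(fields) - 1
        if first_dim is None:
            first_dim = dim
        counts[dim] += 1
    return first_dim, counts


# Tiny made-up file: a "vocab_size dim" header followed by two 3-d vectors.
sample = io.StringIO("2 3\nthe 0.1 0.2 0.3\ncat 0.4 0.5 0.6\n")
first_dim, counts = inspect_embeddings(sample)
print(first_dim, counts.most_common(1)[0][0])
```

If the first line disagrees with the dominant dimension, removing that header line (or fixing the delimiter) before training is the obvious thing to try. On a real file, pass an open file handle instead of the `StringIO` object.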

How can I detect named entities that have more than 1 word using CoreNLP's RegexNER?

I am using the RegexNER annotator in CoreNLP and some of my named entities consist of multiple words. Excerpt from my mapping file:

RAF inhibitor DRUG_CLASS
Gilbert's syndrome DISEASE

The first one gets detected, but each word gets the annotation DRUG_CLASS and there seems to be no way to link the words, such as an NER id which both words would share. The second case does not get detected at all, probably because the tokenizer treats the apostrophe after Gilbert as a separate token. Since RegexNER has the tokenization as a dependency, I can't really get around it. Any suggestions to
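Both symptoms can be illustrated outside CoreNLP with a small sketch. It assumes a simplified tokenizer that splits off `'s` (a stand-in for the real one), tokenizes the mapping entries the same way as the text so that `Gilbert's syndrome` becomes three tokens, and then groups adjacent tokens with the same label into one span. All function names are hypothetical and none of this is the RegexNER implementation.

```python
import re
from itertools import groupby


def tokenize(text):
    # Simplified stand-in for a PTB-style tokenizer: splits off "'s" and punctuation.
    return re.findall(r"'s\b|\w+|[^\w\s]", text)


def label_tokens(tokens, mapping):
    """Label tokens using {tuple_of_tokens: label}; longest patterns are tried first."""
    labels = ["O"] * len(tokens)
    patterns = sorted(mapping.items(), key=lambda item: -len(item[0]))
    i = 0
    while i < len(tokens):
        for pattern, label in patterns:
            if pattern and tuple(tokens[i:i + len(pattern)]) == pattern:
                labels[i:i + len(pattern)] = [label] * len(pattern)
                i += len(pattern)
                break
        else:
            i += 1  # no pattern starts at this token
    return labels


def group_entities(tokens, labels):
    """Merge runs of adjacent tokens that share a label into (tokens, label) spans."""
    spans = []
    position = 0
    for label, run in groupby(labels):
        length = len(list(run))
        if label != "O":
            spans.append((tokens[position:position + length], label))
        position += length
    return spans


# Tokenize mapping entries with the same tokenizer as the text.
mapping = {
    tuple(tokenize("RAF inhibitor")): "DRUG_CLASS",
    tuple(tokenize("Gilbert's syndrome")): "DISEASE",
}
tokens = tokenize("The RAF inhibitor helped a patient with Gilbert's syndrome.")
entities = group_entities(tokens, label_tokens(tokens, mapping))
print(entities)
```

Grouping by adjacency merges two different entities of the same type when they sit next to each other, so a real pipeline would want explicit span boundaries.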

Stanford NER: Can I use two classifiers at once in my code?

Question: In my code, I get Person recognition from the first classifier. For the second classifier, which I made myself, I added some words to be recognized or annotated as Organization, but it does not annotate Person. I need to get the benefit of both of them. How can I do that? I'm using NetBeans, and this is the code:

String serializedClassifier = "classifiers/english.all.3class.distsim.crf.ser.gz";
String serializedClassifier2 = "/Users/ha/stanford-ner-2014-10-26/classifiers/dept-model.ser.gz";
if
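Independently of how the two classifiers are loaded, the merging step itself can be sketched in a few lines. The sketch assumes each classifier yields one label per token over the same tokenization, with `O` as the background label, and that the first classifier wins on conflicts. The function name, the priority rule, and the sample labels are all assumptions made for illustration, not Stanford NER behavior.

```python
def combine_labels(primary, secondary, background="O"):
    """Merge two per-token label sequences; primary wins where both label a token."""
    if len(primary) != len(secondary):
        raise ValueError("label sequences must cover the same tokens")
    return [p if p != background else s for p, s in zip(primary, secondary)]


# Hypothetical outputs of the two classifiers for one sentence.
tokens = ["Alice", "works", "in", "Accounting"]
person_labels = ["PERSON", "O", "O", "O"]      # from the general model
dept_labels = ["O", "O", "O", "ORGANIZATION"]  # from the custom model
combined = combine_labels(person_labels, dept_labels)
print(list(zip(tokens, combined)))
```

Swapping the argument order gives the custom model priority instead, which may be preferable when its word list is meant to override the general model.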