stanford-nlp | 易学教程

How to get NN and NNS from a text?

Question: I want to extract the NN or NNS tokens from the sample text given in the script below. When I use the code below, the output is: types synchronization phase synchronization -RSB- synchronization -LSB- -RSB- projection synchronization. Why am I getting [-RSB-] or [-LSB-] here? Should I use a different pattern to get NN and NNS at the same time? atic = "So far, many different types of synchronization have been investigated, such as complete synchronization [8], generalized synchronization [9]
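A minimal sketch of one way to approach this, assuming the tagger output is already available as (token, tag) pairs. -LSB- and -RSB- are the Penn Treebank escapes for "[" and "]", so they can be filtered out by token text after selecting the NN and NNS tags. The helper name `extract_nouns` and the sample tags are hypothetical and only illustrate the filtering step; they are not taken from the original script.

```python
# Penn Treebank escape tokens for round, square, and curly brackets.
PTB_BRACKETS = {"-LRB-", "-RRB-", "-LSB-", "-RSB-", "-LCB-", "-RCB-"}


def extract_nouns(tagged_tokens):
    """Return tokens tagged NN or NNS, skipping escaped bracket tokens."""
    return [
        token
        for token, tag in tagged_tokens
        if tag in {"NN", "NNS"} and token not in PTB_BRACKETS
    ]


# Hypothetical tagger output, for illustration only.
tagged = [
    ("many", "JJ"), ("types", "NNS"), ("of", "IN"), ("synchronization", "NN"),
    ("-LSB-", "NN"), ("8", "CD"), ("-RSB-", "NN"),
]
print(extract_nouns(tagged))  # ['types', 'synchronization']
```

Matching on the set {"NN", "NNS"} covers both tags in one pass, so no separate pattern per tag is needed.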

Training Stanford-NER-CRF, control number of iterations and regularisation (L1,L2) parameters

Question: I was looking through the StanfordNER documentation/FAQ, but I can't find anything about specifying the maximum number of iterations in training or the values of the regularisation parameters L1 and L2. I saw an answer that suggested setting, for instance, maxIterations=10 in the properties file, but that did not give any results. Is it possible to set these parameters? Answer 1: I had to dig in the code but found it, so basically StanfordNER supports many different numerical

How to feed CoreNLP some pre-labeled Named Entities?

Question: I want to use Stanford CoreNLP to pull out coreferences and start working on the dependencies of pre-labeled text. I eventually hope to build graph nodes and edges between related named entities. I am working in Python, but using nltk's Java functions to call the "edu.stanford.nlp.pipeline.StanfordCoreNLP" jar directly (which is what nltk does behind the scenes anyway). My pre-labeled text is in this format: PRE-LABELED: During his youth, [PERSON: Alexander III of Macedon] was tutored by

How to suppress Stanford CoreNLP Redwood logging in MATLAB?

Question: The thread "How to shutdown Stanford CoreNLP Redwood logging?" supposedly resolved my question for Java. I would like to do the same in MATLAB, but the code given in that thread doesn't work. Please suggest a complete solution, starting with what to import, setting properties, etc. My code is the following: import java.io.*; import edu.stanford.nlp.tagger.maxent.MaxentTagger; tagger = MaxentTagger('./english-left3words-distsim.tagger'); which logs on the command line: Reading POS tagger model

Inconsistent results of POS tagging between core nlp demo and parser demo

Question: The POS tagging results are inconsistent between P: http://nlp.stanford.edu:8080/parser/ and C: http://nlp.stanford.edu:8080/corenlp/process. E.g., C: We went east/JJ to Oslo. P: We went east/RB to Oslo. C: We are all/DT getting older. P: We are all/RB getting older. C: Are you getting excited/VBN about your vacation? P: Are you getting excited/JJ about your vacation? C: Did you do/VBP that? P: Did you do/VB that? It seems that the parser performs better than CoreNLP, but I cannot replicate the
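Disagreements like the ones listed above can be collected mechanically. The following is a rough sketch, assuming both outputs are whitespace-separated strings in which some tokens carry a "word/TAG" suffix, as in the examples quoted in the question. The function name `tag_disagreements` is hypothetical, and the sketch only compares strings; it does not call either demo.

```python
def tag_disagreements(line_a, line_b):
    """Return (word, tag_a, tag_b) for aligned tokens whose tags differ."""
    diffs = []
    for tok_a, tok_b in zip(line_a.split(), line_b.split()):
        # rpartition splits on the last "/", so the tag is the final piece.
        word_a, sep_a, tag_a = tok_a.rpartition("/")
        word_b, sep_b, tag_b = tok_b.rpartition("/")
        # Only compare tokens that are tagged in both lines.
        if sep_a and sep_b and word_a == word_b and tag_a != tag_b:
            diffs.append((word_a, tag_a, tag_b))
    return diffs


corenlp_line = "We went east/JJ to Oslo."
parser_line = "We went east/RB to Oslo."
print(tag_disagreements(corenlp_line, parser_line))  # [('east', 'JJ', 'RB')]
```

The comparison relies on both lines having the same tokenization; if the two tools split the text differently, the tokens would need to be aligned first.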

Stanford NER Error: Loading distsim lexicon Failed

Question: In my project I need to use NER annotation, so I used NERDemo.java. It works fine when I create a new project containing only this code, but when I add it to my project I keep getting errors. I have edited the path in my code to point to the specific location of the classifiers. I added the Jar files: This is the code: String serializedClassifier = "/Users/ha/stanford-ner-2014-10-26/classifiers/english.all.3class.distsim.crf.ser.gz"; String serializedClassifier2 = "/Users/ha/stanford-ner-2014-10-26

List of part of speech tags per sentence with POS Tagger Stanford NLP in C#

Question: Using the POS Tagger of Stanford NLP .NET, I'm trying to extract a detailed list of part of speech tags per sentence, e.g.: "Have a look over there. Look at the car!" Have/VB a/DT look/NN over/IN there/RB ./. Look/VB at/IN the/DT car/NN !/. I need: POS Text: "Have" POS tag: "VB" Position in the original text. I managed to achieve this by accessing the private fields of the result via reflection. I know it's ugly, inefficient, and very bad, but that's the only way I have found until now. Hence my
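One way to recover the three requested fields without touching private fields is to post-process the tagged string itself. The sketch below is written in Python rather than C# and assumes the tagger output is the "word/TAG" string shown in the question. The function name `tags_with_positions` is hypothetical, and the approach is plain string handling, not part of the Stanford API.

```python
def tags_with_positions(original, tagged):
    """Map each word/TAG item to its text, tag, and offset in the original."""
    results = []
    cursor = 0  # search forward only, so repeated words get distinct offsets
    for item in tagged.split():
        # Split on the last "/" so tokens such as "./." parse correctly.
        word, _, tag = item.rpartition("/")
        start = original.find(word, cursor)
        results.append({"text": word, "tag": tag, "start": start})
        if start != -1:
            cursor = start + len(word)
    return results


original = "Have a look over there. Look at the car!"
tagged = ("Have/VB a/DT look/NN over/IN there/RB ./. "
          "Look/VB at/IN the/DT car/NN !/.")
for entry in tags_with_positions(original, tagged):
    print(entry)
```

A token that the tagger rewrites, such as an escaped bracket, will not be found in the original text and gets a start offset of -1, so such cases need separate handling.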

Processing input before giving input to parser

Question: What kind of processing should be done to the input that is given to the parser? As of now I am using the Stanford parser.jar, but there is also the Stanford coreNLP.jar. What is the difference between the parser.jar and coreNLP.jar parsing methods? As per the coreNLP documentation, you can pass the operations you want to perform as input in the annotators. COMMAND: java -cp "*" -Xmx2g edu.stanford.nlp.pipeline.StanfordCoreNLP -annotators tokenize,ssplit,pos,lemma,ner,parse,dcoref -file input.txt To use parsing in

Stanford NNDep parser: java.lang.ArrayIndexOutOfBoundsException

Question: After training a model, I'm trying to parse the test treebank. Unfortunately, this error keeps popping up: Loading depparse model file: nndep.model.txt.gz ... ################### #Transitions: 77 #Labels: 38 ROOTLABEL: root Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 25 at edu.stanford.nlp.parser.nndep.Classifier.preCompute(Classifier.java:663) at edu.stanford.nlp.parser.nndep.Classifier.preCompute(Classifier.java:637) at edu.stanford.nlp.parser.nndep.DependencyParser

Stanford NLP ColumnDataClassifier: How to serialize model with only top features?

Question: When training a model, there is an option called limitFeatures. When I set this option to, say, 100, ColumnDataClassifier uses just the top 100 features. However, it still serializes all the features to the model.ser.gz. When I deserialize this file in my Java code, my program uses approximately 500M of memory. Is there a way to create smaller models with just the selected features? I am using the tool from the CLI, but any solution in Java is very welcome as well. Here is the relevant code from the prop file: