stanford-nlp

How to get NN and NNS from a text?

删除回忆录丶 submitted on 2019-12-25 00:42:57
Question: I want to get the NN or NNS tokens from a sample text, as given in the script below. When I use the code below, the output is: types synchronization phase synchronization -RSB- synchronization -LSB- -RSB- projection synchronization. Why am I getting [-RSB-] or [-LSB-] here? Should I use a different pattern to get NN or NNS at the same time? atic = "So far, many different types of synchronization have been investigated, such as complete synchronization [8], generalized synchronization [9]
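For context, -LSB- and -RSB- are the Penn Treebank-style escapes for "[" and "]", so they show up whenever the tagger happens to label an escaped bracket as a noun. A minimal sketch of one way to filter them out, assuming the tagger output is already available as (word, tag) pairs; the function name and the sample pairs are hypothetical and only for illustration:

```python
# Penn Treebank-style escapes emitted for bracket characters.
PTB_BRACKETS = {"-LRB-", "-RRB-", "-LSB-", "-RSB-", "-LCB-", "-RCB-"}
NOUN_TAGS = {"NN", "NNS"}


def extract_nouns(tagged_tokens):
    """Return words tagged NN or NNS, skipping escaped bracket tokens."""
    return [
        word
        for word, tag in tagged_tokens
        if tag in NOUN_TAGS and word not in PTB_BRACKETS
    ]


# Hypothetical tagger output, not taken from a real run.
sample = [
    ("many", "JJ"),
    ("types", "NNS"),
    ("of", "IN"),
    ("synchronization", "NN"),
    ("-LSB-", "NN"),
    ("8", "CD"),
    ("-RSB-", "NN"),
]
print(extract_nouns(sample))
```

Filtering on the tag and on the token text together handles both NN and NNS in one pass, without needing a separate pattern for each tag.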

Training Stanford-NER-CRF, control number of iterations and regularisation (L1,L2) parameters

China☆狼群 submitted on 2019-12-24 23:15:53
Question: I was looking through the StanfordNER documentation/FAQ, but I can't find anything about specifying the maximum number of iterations in training, or the values of the regularisation parameters L1 and L2. I saw an answer which suggested setting, for instance, maxIterations=10 in the properties file, but that did not give any results. Is it possible to set these parameters? Answer 1: I had to dig in the code but found it, so basically StanfordNER supports many different numerical

How to feed CoreNLP some pre-labeled Named Entities?

假装没事ソ submitted on 2019-12-24 20:14:19
Question: I want to use Stanford CoreNLP to pull out coreferences and start working on the dependencies of pre-labeled text. I eventually hope to build graph nodes and edges between related named entities. I am working in Python, but using NLTK's Java functions to call the "edu.stanford.nlp.pipeline.StanfordCoreNLP" jar directly (which is what NLTK does behind the scenes anyway). My pre-labeled text is in this format: PRE-LABELED: During his youth, [PERSON: Alexander III of Macedon] was tutored by

How to suppress Stanford CoreNLP Redwood logging in MATLAB?

大城市里の小女人 submitted on 2019-12-24 18:45:55
Question: The thread "How to shutdown Stanford CoreNLP Redwood logging?" supposedly resolved my question for Java. I would like to do the same in MATLAB, but the code given in that thread doesn't work. Please suggest a complete solution, starting with what to import, setting properties, etc. My code is the following: import java.io.*; import edu.stanford.nlp.tagger.maxent.MaxentTagger; tagger = MaxentTagger('./english-left3words-distsim.tagger'); which logs on the command line: Reading POS tagger model

Inconsistent results of POS tagging between CoreNLP demo and parser demo

感情迁移 submitted on 2019-12-24 18:37:52
Question: POS tagging results are inconsistent between P: http://nlp.stanford.edu:8080/parser/ and C: http://nlp.stanford.edu:8080/corenlp/process. E.g., C: We went east/JJ to Oslo. P: We went east/RB to Oslo. C: We are all/DT getting older. P: We are all/RB getting older. C: Are you getting excited/VBN about your vacation? P: Are you getting excited/JJ about your vacation? C: Did you do/VBP that? P: Did you do/VB that? It seems that the parser performs better than CoreNLP, but I cannot replicate the
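Disagreements like these are easier to track when the two outputs are compared token by token. A rough sketch, assuming both tools produced the same tokenization and that each output has been converted to (word, tag) pairs; the helper name is hypothetical, and apart from east/JJ versus east/RB the tags in the sample are illustrative guesses rather than real demo output:

```python
def tag_disagreements(tagged_a, tagged_b):
    """Return (index, word, tag_a, tag_b) for every token tagged differently."""
    if len(tagged_a) != len(tagged_b):
        raise ValueError("the two outputs have different token counts")
    diffs = []
    for i, ((word_a, tag_a), (word_b, tag_b)) in enumerate(zip(tagged_a, tagged_b)):
        if word_a != word_b:
            raise ValueError(f"token mismatch at index {i}: {word_a!r} vs {word_b!r}")
        if tag_a != tag_b:
            diffs.append((i, word_a, tag_a, tag_b))
    return diffs


# Only the tags for "east" come from the question; the rest are assumed.
corenlp_out = [("We", "PRP"), ("went", "VBD"), ("east", "JJ"),
               ("to", "TO"), ("Oslo", "NNP"), (".", ".")]
parser_out = [("We", "PRP"), ("went", "VBD"), ("east", "RB"),
              ("to", "TO"), ("Oslo", "NNP"), (".", ".")]
print(tag_disagreements(corenlp_out, parser_out))
```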

Stanford NER Error: Loading distsim lexicon Failed

我与影子孤独终老i submitted on 2019-12-24 18:12:57
Question: In my project I need to use NER annotation, so I used NERDemo.java. It works fine when I create a new project containing only this code, but when I add it to my project I keep getting errors. I have edited the path in my code to the specific location of the classifiers. I added the Jar files: This is the code: String serializedClassifier = "/Users/ha/stanford-ner-2014-10-26/classifiers/english.all.3class.distsim.crf.ser.gz"; String serializedClassifier2 = "/Users/ha/stanford-ner-2014-10-26

List of part-of-speech tags per sentence with the Stanford NLP POS Tagger in C#

坚强是说给别人听的谎言 submitted on 2019-12-24 15:08:21
Question: Using the POS Tagger of Stanford NLP .NET, I'm trying to extract a detailed list of part-of-speech tags per sentence, e.g.: "Have a look over there. Look at the car!" Have/VB a/DT look/NN over/IN there/RB ./. Look/VB at/IN the/DT car/NN !/. I need: POS text: "Have", POS tag: "VB", and the position in the original text. I managed to achieve this by accessing the private fields of the result via reflection. I know it's ugly, inefficient and very bad, but that's the only way I have found so far. Hence my
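One workaround that avoids reflection is to post-process the tagger's word/TAG string and recover offsets by scanning the original text left to right. The sketch below is in Python rather than C#, purely to illustrate the idea; the function name is hypothetical, and it assumes each token appears verbatim in the original text (escaped tokens such as -LRB- would need to be mapped back first):

```python
def tagged_with_offsets(original, tagged):
    """Map 'word/TAG' items to (word, tag, start_offset) in the original text."""
    results = []
    cursor = 0  # search only forward so repeated words get distinct offsets
    for item in tagged.split():
        # Split on the last slash so a word containing '/' keeps its text.
        word, _, tag = item.rpartition("/")
        start = original.find(word, cursor)
        if start == -1:
            raise ValueError(f"token {word!r} not found after offset {cursor}")
        results.append((word, tag, start))
        cursor = start + len(word)
    return results


original = "Have a look over there. Look at the car!"
tagged = "Have/VB a/DT look/NN over/IN there/RB ./. Look/VB at/IN the/DT car/NN !/."
for word, tag, start in tagged_with_offsets(original, tagged):
    print(word, tag, start)
```

Keeping a forward-moving cursor is what distinguishes the two periods-and-words that repeat, since a plain search from the start would always return the first occurrence.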

Processing input before giving input to parser

天涯浪子 submitted on 2019-12-24 14:26:26
Question: What kind of processing should be done to the input that is given to the parser? As of now I am using stanford parser.jar, but there is also stanford coreNLP.jar. What is the difference between the parser.jar and coreNLP.jar parsing methods? As per the CoreNLP documentation, you can pass the operations you want to perform as input in the annotators. COMMAND: java -cp "*" -Xmx2g edu.stanford.nlp.pipeline.StanfordCoreNLP -annotators tokenize,ssplit,pos,lemma,ner,parse,dcoref -file input.txt To use parsing in

Stanford NNDep parser: java.lang.ArrayIndexOutOfBoundsException

强颜欢笑 submitted on 2019-12-24 14:11:04
Question: After training a model, I'm trying to parse the test treebank. Unfortunately, this error keeps popping up: Loading depparse model file: nndep.model.txt.gz ... ################### #Transitions: 77 #Labels: 38 ROOTLABEL: root Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 25 at edu.stanford.nlp.parser.nndep.Classifier.preCompute(Classifier.java:663) at edu.stanford.nlp.parser.nndep.Classifier.preCompute(Classifier.java:637) at edu.stanford.nlp.parser.nndep.DependencyParser

Stanford NLP ColumnDataClassifier: How to serialize model with only top features?

早过忘川 submitted on 2019-12-24 06:47:26
Question: When training a model there is an option called limitFeatures. When I set this option to, say, 100, ColumnDataClassifier uses just the top 100 features. However, it still serializes all the features to model.ser.gz. When I deserialize this file in my Java code, my program uses approx. 500M of memory. Is there a way to create smaller models with just the selected features? I am using the tool from the CLI, but any solution in Java is very welcome as well. Here is the relevant code from the prop file: