stanford-nlp

Annotate author names using REGEXNER from the stanfordnlp library

Submitted by 泄露秘密 on 2020-05-14 08:42:06
Question: My goal is to annotate author names from scientific articles with the entity PERSON. I am particularly interested in names that match this format: (authorname et al. date). For example, in the sentence "(Minot et al. 2000)" I would like Minot to be annotated as a PERSON. I am using an adapted version of the code found on the Stanford NLP team's official page:

import stanfordnlp
from stanfordnlp.server import CoreNLPClient

# example text
print('---')
print('input text')
print('')
text =
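For prototyping the citation pattern itself without a running CoreNLP server, here is a plain Python regex sketch (my own illustration, not part of the stanfordnlp API; the one-token-surname assumption is a simplification):

```python
import re

# Matches "(Name et al. 2000)" style citations and captures the surname
# and year. A single capitalized token as the surname is an assumption.
CITATION = re.compile(r"\(\s*([A-Z][A-Za-z-]+)\s+et\s+al\.?\s+(\d{4})\s*\)")

def find_author_citations(text):
    """Return (surname, year) pairs for every 'X et al. YYYY' citation."""
    return [(m.group(1), m.group(2)) for m in CITATION.finditer(text)]

print(find_author_citations("As shown in (Minot et al. 2000 ), rates rose."))
```

Once the pattern behaves as intended, it can be translated into a RegexNER mapping or TokensRegex rule and fed to the annotator.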

Stanford CoreNLP TokensRegex / Error while parsing the .rules file in Python

Submitted by 大城市里の小女人 on 2020-04-17 23:46:31
Question: I am trying to solve the problem in this link, but it was not possible using RegexNER from the Stanford NLP library. (NB: I am using stanfordnlp library version 0.2.0, Stanford CoreNLP version 3.9.2, and Python 3.7.3.) So I wanted to try a solution using TokensRegex. As a first attempt I tried to use the TokensRegex file tokenrgxrules.rules from this solution:

ner = { type: "CLASS", value: "edu.stanford.nlp.ling.CoreAnnotations$NamedEntityTagAnnotation" }
$ORGANIZATION_TITLES = "/inc\.|corp\./"
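For reference, a fuller sketch of what a rules file in this format can contain, continuing from the `ner` variable above (the specific pattern and action are my assumptions about the intent, not tested against this exact CoreNLP version):

```text
ner = { type: "CLASS", value: "edu.stanford.nlp.ling.CoreAnnotations$NamedEntityTagAnnotation" }

{ ruleType: "tokens",
  // capture group 1: a single capitalized token before "et al."
  pattern: ( ( [ { word:/[A-Z][a-z]+/ } ] ) /et/ /al\.?/ ),
  // annotate only the captured name token as PERSON
  action: ( Annotate($1, ner, "PERSON") ),
  result: "PERSON" }
```

A malformed pattern (e.g. an unescaped `.` or an unbalanced parenthesis) is a common cause of the "error while parsing the .rules file" in the title.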

how to run stanford corenlp server on google colab?

Submitted by 女生的网名这么多〃 on 2020-02-27 13:52:52
Question: I want to use Stanford CoreNLP to obtain dependency parses of sentences. To use Stanford CoreNLP from Python, we need to follow these steps:

1. Install Java.
2. Download stanford-corenlp-full-2018-10-05 and extract it.
3. Change into the stanford-corenlp-full-2018-10-05 folder with the "cd" command.
4. Run this command in that directory: java -mx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9000 -timeout 75000

After that, the Stanford CoreNLP server will run at 'http
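The launch command from the steps above can be collected into a small Python helper, which is convenient in a Colab cell (a sketch; the directory name follows the 2018-10-05 release and the download/extract steps still have to be done separately):

```python
import shlex

def build_corenlp_server_cmd(corenlp_dir="stanford-corenlp-full-2018-10-05",
                             port=9000, timeout_ms=75000, mem="4g"):
    """Build the java launch command from the steps above as an argv list.

    Pass the result to subprocess.Popen(...) once Java is installed and
    the CoreNLP zip has been downloaded and extracted (paths are
    assumptions based on the 2018-10-05 distribution).
    """
    return ["java", f"-mx{mem}", "-cp", f"{corenlp_dir}/*",
            "edu.stanford.nlp.pipeline.StanfordCoreNLPServer",
            "-port", str(port), "-timeout", str(timeout_ms)]

cmd = build_corenlp_server_cmd()
print(shlex.join(cmd))
```

In Colab, launching it with `subprocess.Popen(cmd)` keeps the server running in the background so later cells can talk to `http://localhost:9000`.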

extract all noun phrases from stanford parser output textfile using bash

Submitted by 大兔子大兔子 on 2020-02-04 05:41:28
Question: As a result of running the Stanford Parser, I have output files containing parses in Penn Treebank bracketed format. Each file contains the following:

(ROOT (S (S (NP (NP (DT The) (JJS strongest) (NN rain)) (VP (ADVP (RB ever)) (VBN recorded) (PP (IN in) (NP (NNP India))))) (VP (VP (VBD shut) (PRT (RP down)) (NP (NP (DT the) (JJ financial) (NN hub)) (PP (IN of) (NP (NNP Mumbai))))) (, ,) (VP (VBD snapped) (NP (NN communication) (NNS lines))) (, ,) (VP (VBD closed) (NP (NNS airports))) (CC and) (VP (VBD
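Bash tools like grep struggle with nested brackets, so here is an alternative sketch in Python that walks the bracketed tree and collects every NP subtree's words (my own minimal S-expression parser, not a Stanford tool):

```python
def parse_sexpr(s):
    """Parse a Penn Treebank bracketed string into nested lists."""
    tokens = s.replace("(", " ( ").replace(")", " ) ").split()
    def read(i):
        node = []
        while i < len(tokens):
            t = tokens[i]
            if t == "(":
                child, i = read(i + 1)
                node.append(child)
            elif t == ")":
                return node, i + 1
            else:
                node.append(t)
                i += 1
        return node, i
    tree, _ = read(0)
    return tree[0] if len(tree) == 1 else tree

def leaves(node):
    """Collect the word leaves under a subtree."""
    out = []
    for child in node[1:]:
        out.extend([child] if isinstance(child, str) else leaves(child))
    return out

def noun_phrases(node, acc=None):
    """Return the word sequence of every NP subtree, outermost first."""
    if acc is None:
        acc = []
    if isinstance(node, list):
        if node and node[0] == "NP":
            acc.append(" ".join(leaves(node)))
        for child in node[1:]:
            if isinstance(child, list):
                noun_phrases(child, acc)
    return acc

tree = parse_sexpr("(ROOT (S (NP (DT The) (NN rain)) (VP (VBD stopped))))")
print(noun_phrases(tree))
```

Run it over each output file's contents; nested NPs are reported both as the enclosing phrase and on their own.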

Stanford segmenter nltk Could not find SLF4J in your classpath

Submitted by 假装没事ソ on 2020-01-26 03:14:41
Question: I have set up an NLTK and Stanford environment, and the NLTK and Stanford jars are downloaded. The program using NLTK works fine, but I ran into trouble with the Stanford segmenter. Running a simple program through the Stanford segmenter, I get the error "Could not find SLF4J in your classpath", although I had exported all the jars, including slf4j-api.jar. Details as follows:

Python 3.5
NLTK 3.2.2
Stanford jars 3.7
OS: CentOS
Environment variables:
export JAVA_HOME=/usr/java/jdk1.8.0_60
export NLTK_DATA=/opt/nltk_data
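This error usually means slf4j-api.jar is missing from the classpath that the JVM is actually launched with, regardless of what the shell has exported. A sketch of exports that puts the segmenter jar and slf4j-api.jar on CLASSPATH together (the directory and jar names are illustrative for the 3.7 segmenter distribution, not verified paths):

```text
export JAVA_HOME=/usr/java/jdk1.8.0_60
export STANFORD_SEGMENTER=/opt/stanford-segmenter
export CLASSPATH=$CLASSPATH:$STANFORD_SEGMENTER/stanford-segmenter-3.7.0.jar:$STANFORD_SEGMENTER/slf4j-api.jar
```

If NLTK still cannot see the jar, placing slf4j-api.jar in the same directory as the segmenter jar is a commonly reported workaround.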

Using other language models with Stanford Core NLP

Submitted by *爱你&永不变心* on 2020-01-25 00:57:13
Question: I want to use a Dutch model for named entity recognition with CoreNLP. I have found a pre-trained model from OpenNLP, but it does not seem to be interoperable with CoreNLP. Why is that? Can we still use CoreNLP with languages other than English, Chinese, and Spanish?

Answer 1: CoreNLP currently does not support Dutch. There are some components that work for German and Arabic, but the full pipeline is currently only available for English, Chinese, and Spanish. You can retrain our NER model on the same conllx data
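Retraining the CRF-based NER model is driven by a properties file passed to edu.stanford.nlp.ie.crf.CRFClassifier. A minimal sketch, assuming two-column (word, label) CoNLL-style Dutch training data; the file names are placeholders and the feature flags follow the commonly published Stanford NER training example:

```text
# hypothetical training properties for a Dutch NER model
trainFile = dutch-train.tsv
serializeTo = dutch-ner-model.ser.gz
map = word=0,answer=1

useClassFeature = true
useWord = true
useNGrams = true
noMidNGrams = true
maxNGramLeng = 6
usePrev = true
useNext = true
useSequences = true
usePrevSequences = true
maxLeft = 1
useTypeSeqs = true
useTypeSeqs2 = true
useTypeySequences = true
wordShape = chris2useLC
```

Training is then invoked with `java -cp "*" edu.stanford.nlp.ie.crf.CRFClassifier -prop dutch.prop`, and the serialized model can be plugged into the pipeline via the `ner.model` property.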

Stanford classifier cross validation averaged or aggregate metrics

Submitted by 天大地大妈咪最大 on 2020-01-23 17:21:12
Question: With the Stanford Classifier it is possible to use cross-validation by setting options in the properties file, such as this for 10-fold cross-validation:

crossValidationFolds=10
printCrossValidationDecisions=true
shuffleTrainingData=true
shuffleSeed=1

Running this will output, per fold, the various metrics, such as precision, recall, accuracy/micro-averaged F1, and macro-averaged F1. Is there an option to get an averaged or otherwise aggregated score of all 10 accuracy/micro-averaged F1 or all
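To my knowledge there is no built-in property that prints a single aggregate score (treat that as an assumption for your classifier version); a common workaround is to copy the per-fold scores out of the output and average them yourself:

```python
from statistics import mean, stdev

# Hypothetical per-fold micro-averaged F1 scores transcribed from the
# classifier's cross-validation output (values are illustrative only).
fold_f1 = [0.91, 0.88, 0.90, 0.93, 0.89, 0.92, 0.90, 0.87, 0.91, 0.90]

print(f"mean micro-F1 over {len(fold_f1)} folds: {mean(fold_f1):.4f}")
print(f"std dev across folds: {stdev(fold_f1):.4f}")
```

Reporting the standard deviation alongside the mean is worthwhile, since fold-to-fold variance tells you how stable the classifier is under resampling.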

OpenNLP: Training a custom NER Model for multiple entities

Submitted by 妖精的绣舞 on 2020-01-23 10:59:34
Question: I am trying to train a custom NER model for multiple entities. Here is the sample training data:

count all <START:item_type> operating tables <END> on the <START:location_id> third <END> <START:location_type> floor <END>
count all <START:item_type> items <END> on the <START:location_id> third <END> <START:location_type> floor <END>
how many <START:item_type> beds <END> are in <START:location_type> room <END> <START:location_id> 2 <END>

The NameFinderME.train(.) method takes a string parameter
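Malformed `<START:...>`/`<END>` markup is a frequent cause of training failures, so it helps to sanity-check the data first. A small sketch (my own helper in Python, not part of OpenNLP) that extracts every annotated span from a training line:

```python
import re

# Each OpenNLP annotation looks like: <START:entity_type> tokens ... <END>
SPAN = re.compile(r"<START:([A-Za-z_]+)>\s+(.+?)\s+<END>")

def extract_spans(line):
    """Return (entity_type, span_text) pairs from one training line."""
    return [(m.group(1), m.group(2)) for m in SPAN.finditer(line)]

line = ("count all <START:item_type> operating tables <END> on the "
        "<START:location_id> third <END> <START:location_type> floor <END>")
print(extract_spans(line))
```

Running this over every line and checking that the number of `<START:` markers equals the number of extracted spans quickly flags lines with unbalanced markup.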
