stanford-nlp

Annotate author names using REGEXNER from the stanfordnlp library

Submitted by 泄露秘密 on 2020-05-14 08:42:06
Question: My goal is to annotate author names from scientific articles with the entity PERSON. I am particularly interested in names that match this format: (authorname et al. date). For example, in the sentence "(Minot et al. 2000)" I would like Minot to be annotated as a PERSON. I am using an adapted version of the code found on the Stanford NLP team's official page:

import stanfordnlp
from stanfordnlp.server import CoreNLPClient

# example text
print('---')
print('input text')
print('')
text =
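For prototyping the citation pattern itself without a running CoreNLP server, here is a plain Python regex sketch (my own illustration, not part of the stanfordnlp API; the one-token-surname assumption is a simplification):

```python
import re

# Matches "(Name et al. 2000)" style citations and captures the surname
# and year. A single capitalized token as the surname is an assumption.
CITATION = re.compile(r"\(\s*([A-Z][A-Za-z-]+)\s+et\s+al\.?\s+(\d{4})\s*\)")

def find_author_citations(text):
    """Return (surname, year) pairs for every 'X et al. YYYY' citation."""
    return [(m.group(1), m.group(2)) for m in CITATION.finditer(text)]

print(find_author_citations("As shown in (Minot et al. 2000 ), rates rose."))
```

Once the pattern behaves as intended, it can be translated into a RegexNER mapping or TokensRegex rule and fed to the annotator.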

Stanford CoreNLP TokensRegex / Error while parsing the .rules file in Python

Submitted by 大城市里の小女人 on 2020-04-17 23:46:31
Question: I am trying to solve the problem in this link, but it was not possible using RegexNER from the Stanford NLP library. (NB: I am using stanfordnlp library version 0.2.0, Stanford CoreNLP version 3.9.2, and Python 3.7.3.) So I wanted to try a solution using TokensRegex. As a first attempt I tried to use the TokensRegex file tokenrgxrules.rules from this solution:

ner = { type: "CLASS", value: "edu.stanford.nlp.ling.CoreAnnotations$NamedEntityTagAnnotation" }
$ORGANIZATION_TITLES = "/inc\.|corp\./"
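For reference, a fuller sketch of what a rules file in this format can contain, continuing from the `ner` variable above (the specific pattern and action are my assumptions about the intent, not tested against this exact CoreNLP version):

```text
ner = { type: "CLASS", value: "edu.stanford.nlp.ling.CoreAnnotations$NamedEntityTagAnnotation" }

{ ruleType: "tokens",
  // capture group 1: a single capitalized token before "et al."
  pattern: ( ( [ { word:/[A-Z][a-z]+/ } ] ) /et/ /al\.?/ ),
  // annotate only the captured name token as PERSON
  action: ( Annotate($1, ner, "PERSON") ),
  result: "PERSON" }
```

A malformed pattern (e.g. an unescaped `.` or an unbalanced parenthesis) is a common cause of the "error while parsing the .rules file" in the title.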

how to run stanford corenlp server on google colab?

Submitted by 女生的网名这么多〃 on 2020-02-27 13:52:52
Question: I want to use Stanford CoreNLP to obtain dependency parses of sentences. To use Stanford CoreNLP from Python, we need to follow these steps:

1. Install Java.
2. Download stanford-corenlp-full-2018-10-05 and extract it.
3. Change into the stanford-corenlp-full-2018-10-05 folder with the "cd" command.
4. Run this command in that directory: java -mx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9000 -timeout 75000

After that, the Stanford CoreNLP server will run at 'http
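The launch command from the steps above can be collected into a small Python helper, which is convenient in a Colab cell (a sketch; the directory name follows the 2018-10-05 release and the download/extract steps still have to be done separately):

```python
import shlex

def build_corenlp_server_cmd(corenlp_dir="stanford-corenlp-full-2018-10-05",
                             port=9000, timeout_ms=75000, mem="4g"):
    """Build the java launch command from the steps above as an argv list.

    Pass the result to subprocess.Popen(...) once Java is installed and
    the CoreNLP zip has been downloaded and extracted (paths are
    assumptions based on the 2018-10-05 distribution).
    """
    return ["java", f"-mx{mem}", "-cp", f"{corenlp_dir}/*",
            "edu.stanford.nlp.pipeline.StanfordCoreNLPServer",
            "-port", str(port), "-timeout", str(timeout_ms)]

cmd = build_corenlp_server_cmd()
print(shlex.join(cmd))
```

In Colab, launching it with `subprocess.Popen(cmd)` keeps the server running in the background so later cells can talk to `http://localhost:9000`.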

extract all noun phrases from stanford parser output textfile using bash

Submitted by 大兔子大兔子 on 2020-02-04 05:41:28
Question: As a result of running the Stanford Parser, I have output files containing parses in Penn Treebank bracketed format. Each file contains the following:

(ROOT (S (S (NP (NP (DT The) (JJS strongest) (NN rain)) (VP (ADVP (RB ever)) (VBN recorded) (PP (IN in) (NP (NNP India))))) (VP (VP (VBD shut) (PRT (RP down)) (NP (NP (DT the) (JJ financial) (NN hub)) (PP (IN of) (NP (NNP Mumbai))))) (, ,) (VP (VBD snapped) (NP (NN communication) (NNS lines))) (, ,) (VP (VBD closed) (NP (NNS airports))) (CC and) (VP (VBD
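Bash tools like grep struggle with nested brackets, so here is an alternative sketch in Python that walks the bracketed tree and collects every NP subtree's words (my own minimal S-expression parser, not a Stanford tool):

```python
def parse_sexpr(s):
    """Parse a Penn Treebank bracketed string into nested lists."""
    tokens = s.replace("(", " ( ").replace(")", " ) ").split()
    def read(i):
        node = []
        while i < len(tokens):
            t = tokens[i]
            if t == "(":
                child, i = read(i + 1)
                node.append(child)
            elif t == ")":
                return node, i + 1
            else:
                node.append(t)
                i += 1
        return node, i
    tree, _ = read(0)
    return tree[0] if len(tree) == 1 else tree

def leaves(node):
    """Collect the word leaves under a subtree."""
    out = []
    for child in node[1:]:
        out.extend([child] if isinstance(child, str) else leaves(child))
    return out

def noun_phrases(node, acc=None):
    """Return the word sequence of every NP subtree, outermost first."""
    if acc is None:
        acc = []
    if isinstance(node, list):
        if node and node[0] == "NP":
            acc.append(" ".join(leaves(node)))
        for child in node[1:]:
            if isinstance(child, list):
                noun_phrases(child, acc)
    return acc

tree = parse_sexpr("(ROOT (S (NP (DT The) (NN rain)) (VP (VBD stopped))))")
print(noun_phrases(tree))
```

Run it over each output file's contents; nested NPs are reported both as the enclosing phrase and on their own.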

Stanford segmenter nltk Could not find SLF4J in your classpath

Submitted by 假装没事ソ on 2020-01-26 03:14:41
Question: I have set up an NLTK and Stanford environment, and the NLTK and Stanford jars are downloaded. The program using NLTK works fine, but I ran into trouble with the Stanford segmenter. Running a simple program through the Stanford segmenter, I get the error "Could not find SLF4J in your classpath", although I had exported all the jars, including slf4j-api.jar. Details as follows:

Python 3.5
NLTK 3.2.2
Stanford jars 3.7
OS: CentOS
Environment variables:
export JAVA_HOME=/usr/java/jdk1.8.0_60
export NLTK_DATA=/opt/nltk_data
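This error usually means slf4j-api.jar is missing from the classpath that the JVM is actually launched with, regardless of what the shell has exported. A sketch of exports that puts the segmenter jar and slf4j-api.jar on CLASSPATH together (the directory and jar names are illustrative for the 3.7 segmenter distribution, not verified paths):

```text
export JAVA_HOME=/usr/java/jdk1.8.0_60
export STANFORD_SEGMENTER=/opt/stanford-segmenter
export CLASSPATH=$CLASSPATH:$STANFORD_SEGMENTER/stanford-segmenter-3.7.0.jar:$STANFORD_SEGMENTER/slf4j-api.jar
```

If NLTK still cannot see the jar, placing slf4j-api.jar in the same directory as the segmenter jar is a commonly reported workaround.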

Using other language models with Stanford Core NLP

Submitted by *爱你&永不变心* on 2020-01-25 00:57:13
Question: I want to use a Dutch model for named entity recognition with CoreNLP. I have found a pre-trained model from OpenNLP, but it does not seem to be interoperable with CoreNLP. Why is that? Can we still use CoreNLP with languages other than English, Chinese, and Spanish?

Answer 1: CoreNLP currently does not support Dutch. There are some components that work for German and Arabic, but the full pipeline is currently only available for English, Chinese, and Spanish. You can retrain our NER model on the same conllx data
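Retraining the CRF-based NER model is driven by a properties file passed to edu.stanford.nlp.ie.crf.CRFClassifier. A minimal sketch, assuming two-column (word, label) CoNLL-style Dutch training data; the file names are placeholders and the feature flags follow the commonly published Stanford NER training example:

```text
# hypothetical training properties for a Dutch NER model
trainFile = dutch-train.tsv
serializeTo = dutch-ner-model.ser.gz
map = word=0,answer=1

useClassFeature = true
useWord = true
useNGrams = true
noMidNGrams = true
maxNGramLeng = 6
usePrev = true
useNext = true
useSequences = true
usePrevSequences = true
maxLeft = 1
useTypeSeqs = true
useTypeSeqs2 = true
useTypeySequences = true
wordShape = chris2useLC
```

Training is then invoked with `java -cp "*" edu.stanford.nlp.ie.crf.CRFClassifier -prop dutch.prop`, and the serialized model can be plugged into the pipeline via the `ner.model` property.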

Stanford classifier cross validation averaged or aggregate metrics

Submitted by 天大地大妈咪最大 on 2020-01-23 17:21:12
Question: With the Stanford Classifier it is possible to use cross-validation by setting options in the properties file, such as this for 10-fold cross-validation:

crossValidationFolds=10
printCrossValidationDecisions=true
shuffleTrainingData=true
shuffleSeed=1

Running this will output, per fold, the various metrics, such as precision, recall, accuracy/micro-averaged F1, and macro-averaged F1. Is there an option to get an averaged or otherwise aggregated score of all 10 accuracy/micro-averaged F1 or all
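To my knowledge there is no built-in property that prints a single aggregate score (treat that as an assumption for your classifier version); a common workaround is to copy the per-fold scores out of the output and average them yourself:

```python
from statistics import mean, stdev

# Hypothetical per-fold micro-averaged F1 scores transcribed from the
# classifier's cross-validation output (values are illustrative only).
fold_f1 = [0.91, 0.88, 0.90, 0.93, 0.89, 0.92, 0.90, 0.87, 0.91, 0.90]

print(f"mean micro-F1 over {len(fold_f1)} folds: {mean(fold_f1):.4f}")
print(f"std dev across folds: {stdev(fold_f1):.4f}")
```

Reporting the standard deviation alongside the mean is worthwhile, since fold-to-fold variance tells you how stable the classifier is under resampling.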

OpenNLP: Training a custom NER Model for multiple entities

Submitted by 妖精的绣舞 on 2020-01-23 10:59:34
Question: I am trying to train a custom NER model for multiple entities. Here is the sample training data:

count all <START:item_type> operating tables <END> on the <START:location_id> third <END> <START:location_type> floor <END>
count all <START:item_type> items <END> on the <START:location_id> third <END> <START:location_type> floor <END>
how many <START:item_type> beds <END> are in <START:location_type> room <END> <START:location_id> 2 <END>

The NameFinderME.train(.) method takes a string parameter
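Malformed `<START:...>`/`<END>` markup is a frequent cause of training failures, so it helps to sanity-check the data first. A small sketch (my own helper in Python, not part of OpenNLP) that extracts every annotated span from a training line:

```python
import re

# Each OpenNLP annotation looks like: <START:entity_type> tokens ... <END>
SPAN = re.compile(r"<START:([A-Za-z_]+)>\s+(.+?)\s+<END>")

def extract_spans(line):
    """Return (entity_type, span_text) pairs from one training line."""
    return [(m.group(1), m.group(2)) for m in SPAN.finditer(line)]

line = ("count all <START:item_type> operating tables <END> on the "
        "<START:location_id> third <END> <START:location_type> floor <END>")
print(extract_spans(line))
```

Running this over every line and checking that the number of `<START:` markers equals the number of extracted spans quickly flags lines with unbalanced markup.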
