stanford-nlp

Stanford NER tagger generates 'file not found' exception with provided models

风流意气都作罢 submitted on 2019-12-29 08:25:28
Question: I downloaded Stanford NER 3.4.1, unpacked it, and tried to run named entity recognition on a local file using the default (provided) trained model. I got this:

```
java.io.FileNotFoundException: /u/nlp/data/pos_tags_are_useless/egw4-reut.512.clusters (No such file or directory)
        at edu.stanford.nlp.io.IOUtils.inputStreamFromFile(IOUtils.java:481)
```

What's wrong, and how can I fix it?

Answer 1: It turns out that the provided models use "distributional similarity features" that require a .clusters file at
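Since the hardcoded `/u/nlp/...` path only exists on Stanford's own machines, one workaround is to point the model's properties at a local copy of the clusters file. A minimal sketch, assuming the properties reference the file through a `distSimLexicon` key (the key name and both paths below are placeholders to adapt, not confirmed by the question):

```python
# A minimal sketch, assuming the model's properties reference the cluster file
# through a "distSimLexicon" key; the paths below are placeholders.
def repoint_dist_sim(prop_text, local_clusters_path):
    """Rewrite a hardcoded distSimLexicon path to point at a local copy."""
    out = []
    for line in prop_text.splitlines():
        if line.strip().startswith("distSimLexicon"):
            line = "distSimLexicon = " + local_clusters_path
        out.append(line)
    return "\n".join(out)

props = ("trainFile = ner.train\n"
         "distSimLexicon = /u/nlp/data/pos_tags_are_useless/egw4-reut.512.clusters")
print(repoint_dist_sim(props, "./egw4-reut.512.clusters"))
```

The rewritten properties can then be saved next to the model and passed when loading the classifier.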

PTB treebank from CoNLL-X

ⅰ亾dé卋堺 submitted on 2019-12-29 08:02:09
Question: I have a CoNLL-X format treebank and the corresponding binary parse tree for each sentence, and I want to convert it into PTB format. Are there any converters, or can anyone shed light on the PTB format?

Answer 1: There have been a number of efforts to convert from dependencies (representable in CoNLL-X format) to constituents (representable in Penn Treebank, or PTB, format). Two recent papers and their code: Transforming Dependencies into Phrase Structures (Kong, Rush, and Smith, NAACL 2015). Code.
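For a concrete picture of the target format: PTB files store each parse as a bracketed s-expression with the POS tag and word at the leaves. A minimal sketch (not a dependency-to-constituent converter, just a renderer showing the bracketing style for a flat tree):

```python
# A minimal sketch of PTB-style bracketing: wrap POS-tagged tokens in the
# s-expression notation that Penn Treebank files use. Real PTB trees have
# nested phrase nodes (NP, VP, ...); this renders only a flat tree under S.
def to_flat_ptb(tagged):
    """Render (word, POS) pairs as a flat PTB-style bracketing under S."""
    leaves = " ".join("({} {})".format(pos, word) for word, pos in tagged)
    return "(S {})".format(leaves)

print(to_flat_ptb([("The", "DT"), ("dog", "NN"), ("barks", "VBZ")]))
# (S (DT The) (NN dog) (VBZ barks))
```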

Error using Stanford POS Tagger in NLTK Python

北城余情 submitted on 2019-12-29 06:45:08
Question: I am trying to use the Stanford POS Tagger in NLTK, but I am not able to run the example code given here: http://www.nltk.org/api/nltk.tag.html#module-nltk.tag.stanford

```python
import nltk
from nltk.tag.stanford import POSTagger
st = POSTagger(r'english-bidirectional-distsim.tagger',
               r'D:/stanford-postagger/stanford-postagger.jar')
st.tag('What is the airspeed of an unladen swallow?'.split())
```

I have already added environment variables as CLASSPATH = D:/stanford-postagger/stanford-postagger.jar STANFORD
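For reference, NLTK's Stanford wrappers read the jar and model locations from the `CLASSPATH` and `STANFORD_MODELS` environment variables, which can also be set from inside Python before the tagger is created. A minimal sketch (the `D:/` paths are placeholders mirroring the question; the tagger call itself is left commented out because it needs the actual jar and model files on disk):

```python
import os

# Placeholder install location, mirroring the question; adjust to your machine.
STANFORD_DIR = "D:/stanford-postagger"
os.environ["CLASSPATH"] = STANFORD_DIR + "/stanford-postagger.jar"
os.environ["STANFORD_MODELS"] = STANFORD_DIR + "/models"

# With the environment set, the tagger needs only the model name
# (note the spelling: "distsim", not "distim"):
# from nltk.tag.stanford import StanfordPOSTagger
# st = StanfordPOSTagger("english-bidirectional-distsim.tagger")
# print(st.tag("What is the airspeed of an unladen swallow ?".split()))

print(os.environ["CLASSPATH"])
```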

Coreference resolution in python nltk using Stanford coreNLP

末鹿安然 submitted on 2019-12-29 06:20:19
Question: Stanford CoreNLP provides coreference resolution as mentioned here; this thread also provides some insights about its implementation in Java. However, I am using Python and NLTK, and I am not sure how I can use the coreference resolution functionality of CoreNLP in my Python code. I have been able to set up StanfordParser in NLTK; this is my code so far:

```python
from nltk.parse.stanford import StanfordDependencyParser
stanford_parser_dir = 'stanford-parser/'
eng_model_path = stanford_parser_dir +
```
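NLTK's parser wrappers do not expose CoreNLP's coreference annotator, but a locally running CoreNLP server can be queried over HTTP and returns coref chains as JSON. A minimal sketch of pulling chains out of that JSON (the sample response is hand-written to mirror the server's `corefs` output shape, which is an assumption here, not captured real output; the commented `requests` call shows how a query might look):

```python
# To query a running server you would POST the raw text, e.g.:
#   import requests
#   r = requests.post('http://localhost:9000/',
#                     params={'properties': '{"annotators": "coref", "outputFormat": "json"}'},
#                     data='Barack Obama was born in Hawaii. He was the president.'.encode('utf-8'))
#   chains = coref_chains(r.json())

def coref_chains(response_json):
    """Collect each coref chain as a list of mention strings, representative first."""
    chains = []
    for mentions in response_json.get("corefs", {}).values():
        # Put the representative mention at the front of its chain.
        ordered = sorted(mentions,
                         key=lambda m: not m.get("isRepresentativeMention", False))
        chains.append([m["text"] for m in ordered])
    return chains

sample = {"corefs": {"3": [
    {"text": "Barack Obama", "isRepresentativeMention": True},
    {"text": "He", "isRepresentativeMention": False},
]}}
print(coref_chains(sample))  # [['Barack Obama', 'He']]
```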

How to train the Stanford NLP Sentiment Analysis tool

左心房为你撑大大i submitted on 2019-12-28 01:55:20
Question: Hello everyone! I'm using the Stanford CoreNLP package, and my goal is to perform sentiment analysis on a live stream of tweets. Using the sentiment analysis tool as-is returns a very poor analysis of a text's "attitude": many positives are labeled neutral, and many negatives are rated positive. I've gone ahead and acquired well over a million tweets in a text file, but I haven't a clue how to actually train the tool and create my own model. Link to Stanford Sentiment Analysis page: "Models can be
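The sentiment trainer does not consume raw tweets; the released train/dev/test files are one labeled tree per line, with a 0-4 sentiment score at every node, e.g. `(3 (2 It) (4 (2 was) (4 great)))`. A minimal sketch of producing that surface format (an assumption based on the released data files; this helper emits only a flat tree with neutral leaves and a single overall label, a starting point, since real training data needs per-node labels on binarized trees):

```python
# A minimal sketch of the sentiment training-file format: one labeled tree per
# line, sentiment scores 0 (very negative) .. 4 (very positive) at each node.
# This produces a flat tree with neutral (2) leaves; real training trees are
# binarized with a label on every internal node.
def flat_sentiment_tree(label, tokens):
    leaves = " ".join("(2 {})".format(t) for t in tokens)
    return "({} {})".format(label, leaves)

print(flat_sentiment_tree(4, ["great", "movie"]))  # (4 (2 great) (2 movie))
```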

How to pick out the subject, predicate, object, and adjectives in a sentence

隐身守侯 submitted on 2019-12-25 19:54:43
Question: I want to extract the subject, predicate, and object of a sentence and find out which adjectives go with the subject, predicate, or object, using Stanford CoreNLP in Java code. I have tried to use the dependency parser to solve this by finding the dependency index, checking whether the dependency tag equals `amod`, and then adding it to an `ArrayList`; but with this method, sometimes the adjective's dependency tag is not `amod` but `nmod`, and other tags may come up. With determining the object and predicate
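Whatever parser produces them, the collection step can be prototyped over plain dependency triples. A minimal sketch (plain Python, not CoreNLP; the triples are hand-written for illustration) that gathers the subject and object heads plus any adjectives attached to them via `amod`:

```python
# A minimal sketch over (governor, relation, dependent) triples: collect the
# subject and object heads, and map each head to its amod adjectives.
def extract_roles(triples):
    roles = {"subject": None, "object": None, "adjectives": {}}
    for gov, rel, dep in triples:
        if rel.startswith("nsubj"):          # nominal subject
            roles["subject"] = dep
        elif rel in ("dobj", "obj"):          # direct object (old/new relation names)
            roles["object"] = dep
        elif rel == "amod":                   # adjectival modifier of its governor
            roles["adjectives"].setdefault(gov, []).append(dep)
    return roles

triples = [
    ("chased", "nsubj", "dog"),
    ("dog", "amod", "big"),
    ("chased", "dobj", "cat"),
    ("cat", "amod", "small"),
]
print(extract_roles(triples))
```

Extending the `elif` chain is also where relations like `nmod` would be handled, once you decide which of its subtypes count as adjective attachments for your purposes.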

UTF-8 issue with CoreNLP server

只谈情不闲聊 submitted on 2019-12-25 18:48:21
Question: I run a Stanford CoreNLP server with the following command:

```
java -mx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer
```

I try to parse the sentence `Who was Darth Vader’s son?`. Note that the apostrophe after Vader is not an ASCII character. The online demo successfully parses the sentence; the server I run on localhost fails. I also tried to perform the query using Python:

```python
import requests
url = 'http://localhost:9000/'
sentence = 'Who was Darth Vader’s son?'
r = requests.post(url, params
```
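A common cause of this symptom is the curly apostrophe (U+2019) being encoded with a platform default instead of UTF-8 on its way to the server. A minimal sketch of encoding it explicitly (the localhost URL mirrors the question; the commented `requests` call is an assumption about how the fixed query might look, not confirmed by the thread):

```python
# U+2019 must reach the server as UTF-8. Encoding the POST body explicitly
# avoids any platform-default (e.g. latin-1) encoding on the way out.
from urllib.parse import quote

sentence = 'Who was Darth Vader\u2019s son?'
body = sentence.encode('utf-8')   # explicit UTF-8 bytes for the POST body

# How U+2019 looks when correctly percent-encoded as UTF-8:
print(quote('\u2019'))            # %E2%80%99

# With a running server you would then send:
# import requests
# r = requests.post('http://localhost:9000/', data=body,
#                   headers={'Content-Type': 'text/plain; charset=utf-8'})
```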

XML format in Stanford POS tagger

﹥>﹥吖頭↗ submitted on 2019-12-25 17:00:30
Question: I have tagged 20 sentences, and this is my code:

```java
public class myTag {
    public static void main(String[] args) {
        Properties props = new Properties();
        try {
            props.load(new FileReader("D:/tagger/english-bidirectional-distsim.tagger.props"));
        } catch (FileNotFoundException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        } catch (IOException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
        MaxentTagger tagger = new MaxentTagger("D:/tagger/english-bidirectional-distsim
```
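The tagger can emit XML itself when its properties request it (e.g. an `outputFormat = xml` setting). For post-processing or for building a comparable format by hand, a minimal sketch (plain Python, not the tagger; the element and attribute names are illustrative assumptions, not the tagger's exact schema) of wrapping tagged tokens in per-sentence XML:

```python
# A minimal sketch: wrap (word, POS) pairs in per-sentence XML. The element
# and attribute names here are illustrative, not the tagger's exact schema.
import xml.etree.ElementTree as ET

def tagged_to_xml(sent_id, tagged):
    sent = ET.Element("sentence", id=str(sent_id))
    for i, (word, pos) in enumerate(tagged):
        w = ET.SubElement(sent, "word", wid=str(i), pos=pos)
        w.text = word
    return ET.tostring(sent, encoding="unicode")

print(tagged_to_xml(0, [("Dogs", "NNS"), ("bark", "VBP")]))
```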