stanford-nlp

Reusable version of DKPro Core pipeline

痴心易碎 submitted on 2020-01-03 00:27:15
Question: I have set up DKPro Core as a web service that takes an input and provides tokenised output. The service itself is set up as a Jersey resource: @Path("/") public class MyResource { public MyResource() { // Nothing here } @GET public String generate(@QueryParam("q") final String input) { try { final JCasIterable en = iteratePipeline( createReaderDescription(StringReader.class, StringReader.PARAM_DOCUMENT_TEXT, input, StringReader.PARAM_LANGUAGE, "en") ,createEngineDescription(StanfordSegmenter
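A common fix, sketched below under stated assumptions: build the AnalysisEngine once (e.g. in the resource constructor) and reuse it across requests with a fresh JCas, instead of calling iteratePipeline() per request, which re-instantiates the components and reloads the Stanford models every time. This is a minimal sketch, not the original poster's code; it assumes uimaFIT's AnalysisEngineFactory and DKPro Core's StanfordSegmenter and Token types, and the class name is illustrative.

```java
import static org.apache.uima.fit.factory.AnalysisEngineFactory.createEngine;

import org.apache.uima.analysis_engine.AnalysisEngine;
import org.apache.uima.fit.util.JCasUtil;
import org.apache.uima.jcas.JCas;
import org.apache.uima.resource.ResourceInitializationException;

import de.tudarmstadt.ukp.dkpro.core.api.segmentation.type.Token;
import de.tudarmstadt.ukp.dkpro.core.stanfordnlp.StanfordSegmenter;

public class ReusablePipelineResource {

    // Created once; the segmenter's models stay loaded between requests.
    // Note: a UIMA AnalysisEngine is not thread-safe, so a real service
    // would guard this with a lock or a pool of engines.
    private final AnalysisEngine engine;

    public ReusablePipelineResource() {
        try {
            engine = createEngine(StanfordSegmenter.class);
        } catch (ResourceInitializationException e) {
            throw new IllegalStateException("Could not create pipeline", e);
        }
    }

    public String generate(String input) throws Exception {
        JCas jcas = engine.newJCas();   // cheap compared to engine creation
        jcas.setDocumentText(input);
        jcas.setDocumentLanguage("en");
        engine.process(jcas);

        StringBuilder out = new StringBuilder();
        for (Token token : JCasUtil.select(jcas, Token.class)) {
            out.append(token.getCoveredText()).append('\n');
        }
        return out.toString();
    }
}
```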

Stanford CoreNLP: Use partial existing annotation

烈酒焚心 submitted on 2020-01-02 15:25:14
Question: We are trying to reuse existing tokenization, sentence splitting, and named entity tagging, while we would like Stanford CoreNLP to additionally provide part-of-speech tagging, lemmatization, and parsing. Currently, we are trying it the following way: 1) make an annotator for "pos, lemma, parse": Properties pipelineProps = new Properties(); pipelineProps.put("annotators", "pos, lemma, parse"); pipelineProps.setProperty("parse.maxlen", "80"); pipelineProps.setProperty("pos.maxlen", "80");
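For comparison, a minimal sketch of the usual approach: set the CoreNLP property enforceRequirements to false so the pipeline does not demand its own tokenize/ssplit annotators, then hand it an Annotation whose tokens and sentences were filled in from the existing tools. The token wiring below is illustrative only; real code would also carry over character offsets, token indices, and the existing NER labels.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Properties;

import edu.stanford.nlp.ling.CoreAnnotations;
import edu.stanford.nlp.ling.CoreLabel;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
import edu.stanford.nlp.util.ArrayCoreMap;
import edu.stanford.nlp.util.CoreMap;

public class PartialAnnotationSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("annotators", "pos, lemma, parse");
        props.setProperty("parse.maxlen", "80");
        // Do not insist that tokenize/ssplit run first; we supply them.
        props.setProperty("enforceRequirements", "false");
        StanfordCoreNLP pipeline = new StanfordCoreNLP(props);

        // Pre-existing tokenisation for one sentence.
        String[] words = {"This", "is", "a", "test", "."};
        List<CoreLabel> tokens = new ArrayList<>();
        for (String w : words) {
            CoreLabel token = new CoreLabel();
            token.setWord(w);
            token.setValue(w);
            tokens.add(token);
        }

        Annotation document = new Annotation("This is a test .");
        document.set(CoreAnnotations.TokensAnnotation.class, tokens);

        // Pre-existing sentence split: one sentence covering all tokens.
        CoreMap sentence = new ArrayCoreMap();
        sentence.set(CoreAnnotations.TokensAnnotation.class, tokens);
        document.set(CoreAnnotations.SentencesAnnotation.class,
                Collections.singletonList(sentence));

        pipeline.annotate(document);  // runs only pos, lemma, parse
    }
}
```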

How to use Stanford LexParser for Chinese text?

白昼怎懂夜的黑 submitted on 2020-01-02 09:16:22
Question: I can't seem to get the correct input encoding for Stanford NLP's LexParser. How do I use the Stanford LexParser for Chinese text? I've done the following to download the tool: $ wget http://nlp.stanford.edu/software/stanford-parser-full-2015-04-20.zip $ unzip stanford-parser-full-2015-04-20.zip $ cd stanford-parser-full-2015-04-20/ And my input text is in UTF-8: $ echo "应有尽有 的 丰富 选择 定 将 为 您 的 旅程 增添 无数 的 赏心 乐事 。" > input.txt $ echo "应有尽有#VV 的#DEC 丰富#JJ 选择#NN 定#VV 将#AD 为#P 您#PN 的#DEG 旅程#NN 增添
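For reference, the LexicalizedParser command line accepts an -encoding flag (e.g. -encoding UTF-8) along with the Chinese grammar from the parser models jar. Below is a hedged programmatic sketch: it reads the already-segmented input explicitly as UTF-8 and prints trees through a UTF-8 writer, so no step falls back to the platform default encoding. The chinesePCFG model path is the one shipped inside the models jar; it expects pre-segmented text like the input.txt above.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

import edu.stanford.nlp.ling.Word;
import edu.stanford.nlp.parser.lexparser.LexicalizedParser;
import edu.stanford.nlp.trees.Tree;

public class ChineseLexParserSketch {
    public static void main(String[] args) throws IOException {
        LexicalizedParser parser = LexicalizedParser.loadModel(
                "edu/stanford/nlp/models/lexparser/chinesePCFG.ser.gz");

        // Print through an explicit UTF-8 writer, not the default console charset.
        PrintWriter out = new PrintWriter(
                new OutputStreamWriter(System.out, StandardCharsets.UTF_8), true);

        try (BufferedReader in = new BufferedReader(new InputStreamReader(
                new FileInputStream("input.txt"), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                // input.txt is already word-segmented, so split on whitespace.
                List<Word> sentence = new ArrayList<>();
                for (String tok : line.trim().split("\\s+")) {
                    sentence.add(new Word(tok));
                }
                Tree tree = parser.parse(sentence);
                tree.pennPrint(out);
            }
        }
    }
}
```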

Stanford.NLP for .NET not loading models

让人想犯罪 __ submitted on 2020-01-02 07:11:23
Question: I am trying to run the sample code provided here for Stanford.NLP for .NET. I installed the package via NuGet, downloaded the CoreNLP zip archive, and extracted stanford-corenlp-3.7.0-models.jar. After extracting, I located the "models" directory in stanford-corenlp-full-2016-10-31\edu\stanford\nlp\models. Here is the code that I am trying to run: public static void Test1() { // Path to the folder with models extracted from `stanford-corenlp-3.6.0-models.jar` var jarRoot = @"..\..\..\stanford

Getting an error while integrating Stanford sentiment analysis with Java

别来无恙 submitted on 2020-01-02 04:33:05
Question: I am working on sentiment analysis using the Stanford sentiment NLP library with Java, but when I execute the code I get an error that I cannot figure out. My code is as follows: package com.nlp; import java.util.Properties; import edu.stanford.nlp.ling.CoreAnnotations; import edu.stanford.nlp.pipeline.Annotation; import edu.stanford.nlp.pipeline.StanfordCoreNLP; import edu.stanford.nlp.rnn.RNNCoreAnnotations; import edu.stanford.nlp.sentiment.SentimentCoreAnnotations; import edu
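Since the error itself is truncated above, here is a minimal sentiment pipeline to compare against, under two assumptions: the CoreNLP jar, its models jar, and the ejml dependency are all on the classpath (a missing models or ejml jar is a frequent cause of startup failures), and a recent 3.x release is used, where the annotation classes live at edu.stanford.nlp.neural.rnn.RNNCoreAnnotations and SentimentCoreAnnotations.SentimentAnnotatedTree (older releases used edu.stanford.nlp.rnn and AnnotatedTree, as in the imports above).

```java
import java.util.Properties;

import edu.stanford.nlp.ling.CoreAnnotations;
import edu.stanford.nlp.neural.rnn.RNNCoreAnnotations;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
import edu.stanford.nlp.sentiment.SentimentCoreAnnotations;
import edu.stanford.nlp.trees.Tree;
import edu.stanford.nlp.util.CoreMap;

public class SentimentSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("annotators", "tokenize, ssplit, parse, sentiment");
        StanfordCoreNLP pipeline = new StanfordCoreNLP(props);

        Annotation annotation = pipeline.process("I love this product.");
        for (CoreMap sentence :
                annotation.get(CoreAnnotations.SentencesAnnotation.class)) {
            Tree tree = sentence.get(
                    SentimentCoreAnnotations.SentimentAnnotatedTree.class);
            // Predicted class: 0 = very negative ... 4 = very positive.
            int sentiment = RNNCoreAnnotations.getPredictedClass(tree);
            System.out.println(sentence + " -> " + sentiment);
        }
    }
}
```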

Triple extraction from a sentence

别来无恙 submitted on 2020-01-01 19:24:19
Question: I have this parsed text in the following format; I obtained it using Stanford NLP. (ROOT (S (NP (DT A) (NN passenger) (NN plane)) (VP (VBZ has) (VP (VBD crashed) (ADVP (RB shortly)) (PP (IN after) (NP (NP (NN take-off)) (PP (IN from) (NP (NNP Kyrgyzstan) (`` `) (NNP scapital) (, ,) (NNP Bishkek))))) (, ,) (VP (VBG killing) (NP (NP (DT a) (JJ large) (NN number)) (PP (IN of) (NP (NP (DT those)) (PP (IN on) (NP (NN board))))))))) (. .))) det(plane-3, A-1) nn(plane-3, passenger-2) nsubj(crashed-5, plane-3)
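The typed dependencies above already encode the subject and object links, so one hedged approach, sketched below, is to re-derive the SemanticGraph with CoreNLP and pair each verb's nsubj edge with its dobj edge to form (subject, verb, object) triples. This is only one simple strategy; newer CoreNLP releases also ship an openie annotator built for exactly this kind of extraction.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

import edu.stanford.nlp.ling.CoreAnnotations;
import edu.stanford.nlp.ling.IndexedWord;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
import edu.stanford.nlp.semgraph.SemanticGraph;
import edu.stanford.nlp.semgraph.SemanticGraphCoreAnnotations;
import edu.stanford.nlp.semgraph.SemanticGraphEdge;
import edu.stanford.nlp.util.CoreMap;

public class TripleSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("annotators", "tokenize, ssplit, pos, lemma, parse");
        StanfordCoreNLP pipeline = new StanfordCoreNLP(props);

        Annotation doc = new Annotation(
                "The crash killed a large number of passengers.");
        pipeline.annotate(doc);

        for (CoreMap sentence : doc.get(CoreAnnotations.SentencesAnnotation.class)) {
            SemanticGraph deps = sentence.get(
                SemanticGraphCoreAnnotations.CollapsedCCProcessedDependenciesAnnotation.class);

            // Collect each governor's subject and direct object, if any.
            Map<IndexedWord, IndexedWord> subjects = new HashMap<>();
            Map<IndexedWord, IndexedWord> objects = new HashMap<>();
            for (SemanticGraphEdge edge : deps.edgeIterable()) {
                String rel = edge.getRelation().getShortName();
                if ("nsubj".equals(rel)) {
                    subjects.put(edge.getGovernor(), edge.getDependent());
                } else if ("dobj".equals(rel)) {
                    objects.put(edge.getGovernor(), edge.getDependent());
                }
            }

            // Emit (subject, verb, object) where both ends exist.
            for (Map.Entry<IndexedWord, IndexedWord> e : subjects.entrySet()) {
                IndexedWord verb = e.getKey();
                if (objects.containsKey(verb)) {
                    System.out.printf("(%s, %s, %s)%n",
                            e.getValue().word(), verb.word(),
                            objects.get(verb).word());
                }
            }
        }
    }
}
```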

Setting up NLTK with Stanford NLP (both StanfordNERTagger and StanfordPOSTagger) for Spanish

China☆狼群 submitted on 2020-01-01 12:11:32
Question: The NLTK documentation is rather sparse on this integration. The steps I followed were: Download http://nlp.stanford.edu/software/stanford-postagger-full-2015-04-20.zip to /home/me/stanford Download http://nlp.stanford.edu/software/stanford-spanish-corenlp-2015-01-08-models.jar to /home/me/stanford Then, in an IPython console: In [11]: import nltk In [12]: nltk.__version__ Out[12]: '3.1' In [13]: from nltk.tag import StanfordNERTagger Then st = StanfordNERTagger('/home/me/stanford/stanford

Splitting a Chinese document into sentences [closed]

99封情书 submitted on 2020-01-01 11:50:32
Question: I have to split Chinese text into multiple sentences. I tried the Stanford DocumentPreprocessor. It worked quite well for English but not for Chinese. Can you please let me know of any good sentence splitters for Chinese, preferably in Java or Python? Answer 1: Using some regex tricks in Python (c.f. a modified regex of
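Since the question asks for Java or Python and Answer 1's regex trick is cut off above, here is a hedged Java transliteration of the same idea: split on the CJK sentence terminators 。！？ (plus their ASCII counterparts), keeping each terminator attached to the sentence it ends. The exact character class is an assumption; extend it with ；, …, and so on as your data requires.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ChineseSentenceSplitter {

    // One "sentence" = a run of non-terminator characters followed by an
    // optional terminator (CJK 。！？ or ASCII .!?).
    private static final Pattern SENTENCE =
            Pattern.compile("[^。！？.!?]+[。！？.!?]?");

    public static List<String> split(String text) {
        List<String> sentences = new ArrayList<>();
        Matcher m = SENTENCE.matcher(text);
        while (m.find()) {
            String s = m.group().trim();
            if (!s.isEmpty()) {
                sentences.add(s);
            }
        }
        return sentences;
    }

    public static void main(String[] args) {
        for (String s : split("这是第一句。这是第二句！这是第三句？")) {
            System.out.println(s);
        }
    }
}
```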

How to get phrase tags in Stanford CoreNLP?

假如想象 submitted on 2019-12-31 01:50:10
Question: If I want to get the phrase tag corresponding to each word, how do I get it? For example, for the sentence "My dog also likes eating sausage." I can get a parse tree from Stanford NLP such as (ROOT (S (NP (PRP$ My) (NN dog)) (ADVP (RB also)) (VP (VBZ likes) (NP (JJ eating) (NN sausage))) (. .))) In the above situation, I want to get the phrase tag corresponding to each word, like (My - NP), (dog - NP), (also - ADVP), (likes - VP), ... Is there a method for this simple extraction of phrase tags? Please
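One hedged way to get exactly these (word - phrase tag) pairs from the tree itself, sketched below: for each leaf, its parent is the part-of-speech preterminal and its grandparent is the smallest phrase containing the word, so printing the grandparent's label yields (My - NP), (likes - VP), and so on. Note that Tree.parent() needs the root passed in, because CoreNLP trees do not store parent pointers.

```java
import java.util.Properties;

import edu.stanford.nlp.ling.CoreAnnotations;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
import edu.stanford.nlp.trees.Tree;
import edu.stanford.nlp.trees.TreeCoreAnnotations;
import edu.stanford.nlp.util.CoreMap;

public class PhraseTagSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("annotators", "tokenize, ssplit, pos, parse");
        StanfordCoreNLP pipeline = new StanfordCoreNLP(props);

        Annotation doc = new Annotation("My dog also likes eating sausage.");
        pipeline.annotate(doc);

        for (CoreMap sentence : doc.get(CoreAnnotations.SentencesAnnotation.class)) {
            Tree root = sentence.get(TreeCoreAnnotations.TreeAnnotation.class);
            for (Tree leaf : root.getLeaves()) {
                Tree preterminal = leaf.parent(root);   // POS tag node, e.g. NN
                Tree phrase = preterminal.parent(root); // smallest phrase, e.g. NP
                System.out.println("(" + leaf.value() + " - "
                        + phrase.label().value() + ")");
            }
        }
    }
}
```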

NLTK Stanford POS tagger error: Java command failed

北战南征 submitted on 2019-12-30 04:22:08
Question: I'm trying to use the nltk.tag.stanford module for tagging a sentence (like the wiki's example) but I keep getting the following error: Traceback (most recent call last): File "test.py", line 28, in <module> print st.tag(word_tokenize('What is the airspeed of an unladen swallow ?')) File "/usr/local/lib/python2.7/dist-packages/nltk/tag/stanford.py", line 59, in tag return self.tag_sents([tokens])[0] File "/usr/local/lib/python2.7/dist-packages/nltk/tag/stanford.py", line 81, in tag_sents