opennlp

Is there a way to get the subject of a sentence using OpenNLP?

假如想象 posted on 2019-11-30 00:34:51
Is there a way to get the subject of a sentence using OpenNLP? I'm trying to identify the most important part of a user's sentence. Generally, users will be submitting sentences to our "engine", and we want to know exactly what the core topic of each sentence is. Currently we are using OpenNLP to:

- Chunk the sentence
- Identify the noun phrases, verbs, etc. of the sentence
- Identify all "topics" of the sentence (NOT YET DONE!)
- Identify the "core topic" of the sentence

Please let me know if you have any bright ideas.

Dependency Parser

If you're interested in extracting grammatical relations such as
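As a rough illustration of the chunk-based workflow listed in the question, here is a minimal Python sketch of one possible heuristic: take the last noun phrase that appears before the first verb phrase as a subject candidate. This is not the dependency-parser approach mentioned above, and it is not part of OpenNLP. It assumes the chunker output has already been produced as bracketed text of the form `[NP word_TAG ... ]`; the sample line and the function names are made up for illustration.

```python
import re

# Assumed input shape: "[NP The_DT user_NN ] [VP submitted_VBD ] ..."
CHUNK = re.compile(r"\[(\w+)\s+([^\]]*?)\s*\]")

def parse_chunks(line):
    """Return a list of (chunk_label, [words]) from bracketed chunker text."""
    chunks = []
    for label, body in CHUNK.findall(line):
        # Each token looks like word_TAG; keep only the word part.
        words = [tok.rsplit("_", 1)[0] for tok in body.split()]
        chunks.append((label, words))
    return chunks

def subject_candidate(line):
    """Heuristic only: the last NP seen before the first VP, or None."""
    last_np = None
    for label, words in parse_chunks(line):
        if label == "NP":
            last_np = " ".join(words)
        elif label == "VP":
            return last_np
    return None

# Hypothetical sample line, not taken from the original post.
sample = "[NP The_DT user_NN ] [VP submitted_VBD ] [NP a_DT sentence_NN ] ._."
print(subject_candidate(sample))
```

A heuristic like this breaks on passives, fronted adverbials, and coordinated clauses, which is one reason a dependency parser is usually suggested for finding grammatical subjects.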

How to use OpenNLP with Java?

大城市里の小女人 posted on 2019-11-29 20:29:59
I want to POS-tag an English sentence and do some processing. I would like to use OpenNLP, and I have it installed. When I execute the command

    I:\Workshop\Programming\nlp\opennlp-tools-1.5.0-bin\opennlp-tools-1.5.0>java -jar opennlp-tools-1.5.0.jar POSTagger models\en-pos-maxent.bin < Text.txt

it gives the following output, POS-tagging the input in Text.txt:

    Loading POS Tagger model ... done (4.009s)
    My_PRP$ name_NN is_VBZ Shabab_NNP i_FW am_VBP 22_CD years_NNS old._.

    Average: 66.7 sent/s
    Total: 1 sent
    Runtime: 0.015s

I hope it is installed properly? Now, how do I do this POS tagging from inside a Java application? I
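The tagger output quoted in the question uses a `word_TAG` layout, so the "do some processing" step can be sketched independently of how the tagger is invoked. The Python sketch below only post-processes that text; it does not call OpenNLP, and the function name is an illustrative choice. It splits on the last underscore so that a token such as `old._.` keeps its trailing period as part of the word.

```python
def parse_tagged_line(line):
    """Split 'word_TAG' tokens into (word, tag) pairs."""
    pairs = []
    for token in line.split():
        # rpartition splits on the LAST underscore, so words that
        # contain underscores or punctuation stay intact.
        word, sep, tag = token.rpartition("_")
        if not sep:
            continue  # skip anything that is not in word_TAG form
        pairs.append((word, tag))
    return pairs

# Tagged line copied from the command-line output in the question.
tagged = "My_PRP$ name_NN is_VBZ Shabab_NNP i_FW am_VBP 22_CD years_NNS old._."
pairs = parse_tagged_line(tagged)

# Example processing step: keep every token whose tag starts with NN.
nouns = [word for word, tag in pairs if tag.startswith("NN")]
print(pairs)
print(nouns)
```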

Entity Extraction/Recognition with free tools while feeding Lucene Index

时光总嘲笑我的痴心妄想 posted on 2019-11-29 18:40:24
I'm currently investigating options for extracting person names, locations, tech words and categories from text (a lot of articles from the web), which will then be fed into a Lucene/ElasticSearch index. The additional information is added as metadata and should increase the precision of the search. E.g. when someone queries 'wicket', he should be able to decide whether he means the cricket sport or the Apache project. I tried to implement this on my own, with minor success so far. Now I have found a lot of tools, but I'm not sure if they are suited for this task and which of them integrates well with

OpenNLP: foreign names do not get recognized

泄露秘密 posted on 2019-11-29 14:25:09
I just started using OpenNLP to recognize names. I am using the model (en-ner-person.bin) that comes with OpenNLP. I noticed that while it recognizes US, UK, and European names, it fails to recognize Indian or Japanese names. My questions are: (1) Are there models already available that I can use to recognize foreign names? (2) If not, then I believe I will need to generate new models. In that case, is there a corpus available that I can use?

markgiaconia: You can make your own model with your data using an OpenNLP add-on called modelbuilder-addon. If you try it, you may be the first one to do so

could not find function tagPOS

你离开我真会死。 posted on 2019-11-29 07:59:36
Trying to tag a sentence using openNLP.

    library(openNLP)
    str <- "this is a the first sentence."
    tagged_str <- tagPOS(str)

Getting the following error:

    Error: could not find function "tagPOS"

Any suggestions? Thanks.

I think tagPOS is not a built-in function of any of the packages, so you'll have to define the function yourself. Here is the R code:

    library(NLP)
    library(openNLP)
    tagPOS <- function(x, ...) {
      s <- as.String(x)
      word_token_annotator <- Maxent_Word_Token_Annotator()
      a2 <- Annotation(1L, "sentence", 1L, nchar(s))
      a2 <- annotate(s, word_token_annotator, a2)
      a3 <- annotate(s, Maxent_POS_Tag

How to detect that two sentences are similar?

折月煮酒 posted on 2019-11-28 21:54:07
Question: I want to compute how similar two arbitrary sentences are to each other. For example:

    A mathematician found a solution to the problem.
    The problem was solved by a young mathematician.

I can use a tagger, a stemmer, and a parser, but I don't know how to detect that these sentences are similar.

Answer 1: These two sentences are not just similar, they are almost paraphrases, i.e., two alternative ways of expressing the same meaning. It is also a very simple case of paraphrase, in which both utterances
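For a sense of how far surface overlap alone gets on the two example sentences, here is a minimal Python baseline using Jaccard similarity over lowercased word sets. This is a deliberately simple stand-in, not the method the answer above goes on to describe, and the function names are illustrative.

```python
import re

def tokens(sentence):
    """Lowercase the sentence and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9']+", sentence.lower()))

def jaccard(a, b):
    """Jaccard similarity: shared words divided by all distinct words."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 1.0  # two empty sentences are treated as identical
    return len(ta & tb) / len(ta | tb)

s1 = "A mathematician found a solution to the problem."
s2 = "The problem was solved by a young mathematician."
score = jaccard(s1, s2)
print(round(score, 2))
```

The two sentences share 4 of 11 distinct words, so the score is 4/11 (about 0.36) even though they are near paraphrases. Word overlap misses the active/passive alternation and the "found a solution"/"solved" correspondence, which is why stemming, parsing, or lexical resources come into play.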

Is there a way to get the subject of a sentence using OpenNLP?

筅森魡賤 posted on 2019-11-28 21:29:39
Question: Is there a way to get the subject of a sentence using OpenNLP? I'm trying to identify the most important part of a user's sentence. Generally, users will be submitting sentences to our "engine", and we want to know exactly what the core topic of each sentence is. Currently we are using OpenNLP to:

- Chunk the sentence
- Identify the noun phrases, verbs, etc. of the sentence
- Identify all "topics" of the sentence (NOT YET DONE!)
- Identify the "core topic" of the sentence

Please let me know if you have

How to use OpenNLP with Java?

我的梦境 posted on 2019-11-28 17:09:54
Question: I want to POS-tag an English sentence and do some processing. I would like to use OpenNLP, and I have it installed. When I execute the command

    I:\Workshop\Programming\nlp\opennlp-tools-1.5.0-bin\opennlp-tools-1.5.0>java -jar opennlp-tools-1.5.0.jar POSTagger models\en-pos-maxent.bin < Text.txt

it gives the following output, POS-tagging the input in Text.txt:

    Loading POS Tagger model ... done (4.009s)
    My_PRP$ name_NN is_VBZ Shabab_NNP i_FW am_VBP 22_CD years_NNS old._.

    Average: 66.7 sent/s
    Total: 1 sent
    Runtime: 0

Entity Extraction/Recognition with free tools while feeding Lucene Index

 ̄綄美尐妖づ posted on 2019-11-28 13:23:05
Question: I'm currently investigating options for extracting person names, locations, tech words and categories from text (a lot of articles from the web), which will then be fed into a Lucene/ElasticSearch index. The additional information is added as metadata and should increase the precision of the search. E.g. when someone queries 'wicket', he should be able to decide whether he means the cricket sport or the Apache project. I tried to implement this on my own, with minor success so far. Now I found a

How to create a good NER training model in OpenNLP?

霸气de小男生 posted on 2019-11-28 07:32:29
I have just started with OpenNLP. I need to create a simple training model to recognize named entities. Reading the docs here, https://opennlp.apache.org/docs/1.8.0/apidocs/opennlp-tools/opennlp/tools/namefind, I see this simple text for training the model:

    <START:person> Pierre Vinken <END> , 61 years old , will join the board as a nonexecutive director Nov. 29 .
    Mr . <START:person> Vinken <END> is chairman of Elsevier N.V. , the Dutch publishing group .
    <START:person> Rudolph Agnew <END> , 55 years old and former chairman of Consolidated Gold Fields PLC , was named a director of this British
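When preparing training text in the `<START:type> ... <END>` layout quoted above, it can help to sanity-check the annotations before handing the file to a trainer. The Python sketch below does not call OpenNLP. It lists the annotated spans on a line, recovers the plain text, and checks that start and end markers are balanced. The function names are illustrative, and only typed markers such as `<START:person>` are handled.

```python
import re

# Matches one annotated span: <START:type> some tokens <END>
SPAN = re.compile(r"<START:(\w+)>\s+(.*?)\s+<END>")

def extract_entities(line):
    """Return (entity_type, text) for every annotated span on the line."""
    return [(m.group(1), m.group(2)) for m in SPAN.finditer(line)]

def strip_markup(line):
    """Remove the markers, leaving the whitespace-tokenized sentence."""
    return re.sub(r"\s*<START:\w+>\s*|\s*<END>\s*", " ", line).strip()

def is_balanced(line):
    """True if the line has as many <START:...> markers as <END> markers."""
    return len(re.findall(r"<START:\w+>", line)) == line.count("<END>")

# The two complete sentences quoted in the question.
lines = [
    "<START:person> Pierre Vinken <END> , 61 years old , will join the board "
    "as a nonexecutive director Nov. 29 .",
    "Mr . <START:person> Vinken <END> is chairman of Elsevier N.V. , "
    "the Dutch publishing group .",
]
for line in lines:
    print(extract_entities(line), is_balanced(line))
```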