opennlp

OpenNLP: Training a custom NER Model for multiple entities

泄露秘密 submitted on 2020-01-23 10:59:22
Question: I am trying to train a custom NER model for multiple entities. Here is the sample training data:

count all <START:item_type> operating tables <END> on the <START:location_id> third <END> <START:location_type> floor <END>
count all <START:item_type> items <END> on the <START:location_id> third <END> <START:location_type> floor <END>
how many <START:item_type> beds <END> are in <START:location_type> room <END> <START:location_id> 2 <END>

The NameFinderME.train(.) method takes a string parameter
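
To make the annotation format in the sample concrete, here is a minimal sketch that reads one training line and lists the entity types it marks. It uses only the JDK, not OpenNLP itself; the class `AnnotatedLineParser` and its `Entity` helper are hypothetical names introduced for illustration, and the parser assumes that tokens and tags are separated by whitespace, as they are in the sample above.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: read one line in the <START:type> ... <END> format, using only the JDK. */
class AnnotatedLineParser {

    /** One labelled entity: its type and the text it covers (hypothetical helper type). */
    static final class Entity {
        final String type;
        final String text;

        Entity(String type, String text) {
            this.type = type;
            this.text = text;
        }

        @Override
        public String toString() {
            return type + "=" + text;
        }
    }

    /** Splits on whitespace, so every tag must be surrounded by spaces. */
    static List<Entity> parse(String line) {
        List<Entity> entities = new ArrayList<>();
        String currentType = null;               // non-null while inside a span
        StringBuilder current = new StringBuilder();
        for (String token : line.trim().split("\\s+")) {
            if (token.startsWith("<START:") && token.endsWith(">")) {
                // Opening tag: remember the entity type and start collecting tokens.
                currentType = token.substring("<START:".length(), token.length() - 1);
                current.setLength(0);
            } else if (token.equals("<END>")) {
                // Closing tag: emit the collected entity, if a span was open.
                if (currentType != null) {
                    entities.add(new Entity(currentType, current.toString()));
                }
                currentType = null;
            } else if (currentType != null) {
                if (current.length() > 0) {
                    current.append(' ');
                }
                current.append(token);
            }
        }
        return entities;
    }

    public static void main(String[] args) {
        String line = "how many <START:item_type> beds <END> are in "
                + "<START:location_type> room <END> <START:location_id> 2 <END>";
        System.out.println(parse(line));
        // prints [item_type=beds, location_type=room, location_id=2]
    }
}
```

A check like this can be run over a training file before training to confirm that every span is closed and that all three entity types actually occur in the data.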

NoClassDefFoundError: opennlp/tools/chunker/ChunkerModel

偶尔善良 submitted on 2020-01-17 05:19:49
Question: I got this error while trying OpenNLP chunking: NoClassDefFoundError: opennlp/tools/chunker/ChunkerModel. Here is the basic code:

    import java.io.*;
    import opennlp.tools.chunker.*;

    public class test {
        public static void main(String[] args) throws IOException {
            ChunkerModel model = null;
            InputStream modelIn = new FileInputStream("en-parser-chunking.bin");
            model = new ChunkerModel(modelIn);
        }
    }

Answer 1: I don't see any NLP-specific cause here, so just check tutorials about NoClassDefFoundError, for
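
A NoClassDefFoundError at run time generally means the class was available when the code was compiled but is not on the classpath when it runs. The following is a minimal sketch for checking that, using only the JDK; `ClasspathCheck` and `isOnClasspath` are hypothetical names introduced here, and the class name passed in `main` is taken from the error message above.

```java
/** Sketch: check whether a class is visible to the current class loader. */
class ClasspathCheck {

    /** Returns true if the named class can be located by this class's loader. */
    static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers.
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // NoClassDefFoundError is a LinkageError, so both failure modes land here.
            return false;
        }
    }

    public static void main(String[] args) {
        String name = "opennlp.tools.chunker.ChunkerModel";
        System.out.println(name + " on classpath: " + isOnClasspath(name));
        // The effective run-time classpath, for comparison with the compile-time one.
        System.out.println("java.class.path = " + System.getProperty("java.class.path"));
    }
}
```

If the check prints `false`, the OpenNLP jar is probably missing from the run-time classpath (for example from the `-cp` argument or the IDE run configuration), even though it was present at compile time.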

opennlp sample training data for disease

笑着哭i submitted on 2020-01-17 05:03:32
Question: I'm using OpenNLP for data classification. I could not find a TokenNameFinderModel for diseases here. I know I can create my own model, but I was wondering whether any large sample training data set is available for diseases. Answer 1: You can easily create your own training data set using the modelbuilder addon, following the rules mentioned here to train a good NER model. You can find some help on using the modelbuilder addon here. Basically, you put all the information in a text file and the NER

Apache OpenNLP: java.io.FileInputStream cannot be cast to opennlp.tools.util.InputStreamFactory

狂风中的少年 submitted on 2020-01-13 13:09:11
Question: I am trying to build a custom NER using Apache OpenNLP 1.7. From the documentation available here, I have developed the following code:

    import java.io.BufferedOutputStream;
    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.nio.charset.Charset;
    import opennlp.tools.namefind.NameFinderME;
    import opennlp.tools.namefind.NameSample;
    import opennlp.tools.namefind.NameSampleDataStream;
    import opennlp.tools.namefind.TokenNameFinderFactory;
    import

Why is a self trained NER-Model incompatible with the version of OpenNLP?

*爱你&永不变心* submitted on 2020-01-03 05:48:28
Question: I trained an OpenNLP NER model to detect a new entity, but when I use this model I encounter the following exception:

    Exception in thread "main" java.lang.IllegalArgumentException: opennlp.tools.util.InvalidFormatException: Model version 1.6.0 is not supported by this (1.5.3) version of OpenNLP!

I am using OpenNLP version 1.6.0 and my source code is this:

    String[] sentences = Fragmentation.getSentences(Document);
    InputStream modelIn = new FileInputStream("Models/en-ner-cvskill.bin");
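
The exception text reports 1.5.3 as the version of the library that is actually running, which suggests that an older OpenNLP jar is being picked up at run time alongside or instead of 1.6.0. One way to investigate is to print where a class was loaded from. The sketch below uses only the JDK; `JarLocator` and `locationOf` are hypothetical names introduced for illustration, and in a real project one would pass an OpenNLP class such as `NameFinderME.class` instead of the demo class.

```java
import java.security.CodeSource;

/** Sketch: report the jar or directory a class was loaded from. */
class JarLocator {

    /** Returns the code source location of the class, or a placeholder if none is recorded. */
    static String locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        // Classes loaded by the bootstrap loader typically have no code source.
        return source == null ? "(no code source recorded)" : String.valueOf(source.getLocation());
    }

    public static void main(String[] args) {
        // Assumption: replace JarLocator.class with an OpenNLP class in a project
        // that has OpenNLP on the classpath, to see which jar version is in use.
        System.out.println(locationOf(JarLocator.class));
    }
}
```

If the printed path points at a 1.5.3 jar, removing that jar from the classpath (or from the build's transitive dependencies) is the likely fix.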

How to get the annotated text for a DictionaryAnnotator

[亡魂溺海] submitted on 2020-01-03 05:23:15
Question: I have a dictionary created with the DictionaryCreator from UIMA. I would like to annotate a piece of text using the DictionaryAnnotator and that dictionary, but I could not figure out how to get the annotated text. Please let me know if you do. Any help is appreciated. The code, the dictionary file, and the descriptor are given below. P.S. I'm new to Apache UIMA.

    XMLInputSource xml_in = new XMLInputSource("DictionaryAnnotatorDescriptor.xml");
    ResourceSpecifier specifier =

How to extract sentences containing specific person names using R

半城伤御伤魂 submitted on 2020-01-01 09:33:08
Question: I am using R to extract sentences containing specific person names from texts. Here is a sample paragraph:

Opposed as a reformer at Tübingen, he accepted a call to the University of Wittenberg by Martin Luther, recommended by his great-uncle Johann Reuchlin. Melanchthon became professor of the Greek language in Wittenberg at the age of 21. He studied the Scripture, especially of Paul, and Evangelical doctrine. He was present at the disputation of Leipzig (1519) as a spectator, but

Training own model in opennlp

南楼画角 submitted on 2019-12-30 00:51:09
Question: I am finding it difficult to create my own OpenNLP model. Can anyone tell me how to create my own model? How should the training be done? What should the input be, and where will the output model file be stored? Answer 1: https://opennlp.apache.org/docs/1.5.3/manual/opennlp.html This website is very useful. It shows, both in code and using the OpenNLP application, how to train models of all the different types, such as entity extraction and part of speech. I could give you some code examples here, but the page

OpenNLP Name Entity Recognizer output

血红的双手。 submitted on 2019-12-25 05:33:14
Question: I have trained an OpenNLP named entity recognizer. When I use it on some data, it gives output like: [0..1) location. I would rather output the original name that occurred in the data. Answer 1: This is a Span object's toString() output. Each call to find(String[]) can return multiple Spans, hence the find() method returns Span[]. Use this code to get the actual named entities:

    // "tokens" here is the String[] of words in your sentence
    Span[] find = nf.find(tokens);
    // use the Span's static method
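
The output `[0..1) location` describes a half-open range of token indices: the entity starts at token 0 and ends before token 1. As an illustration of what mapping such a range back to text involves, here is a minimal sketch using only the JDK. `SpanText` and `covered` are hypothetical names, and the token array is made-up example data; this is not the OpenNLP API itself.

```java
import java.util.Arrays;

/** Sketch: recover the text covered by a half-open token range such as [0..1). */
class SpanText {

    /** Joins tokens[start] through tokens[end - 1]; end is exclusive. */
    static String covered(String[] tokens, int start, int end) {
        return String.join(" ", Arrays.copyOfRange(tokens, start, end));
    }

    public static void main(String[] args) {
        // Made-up example sentence, already split into tokens.
        String[] tokens = {"Berlin", "is", "a", "large", "city"};
        System.out.println(covered(tokens, 0, 1)); // prints Berlin
        System.out.println(covered(tokens, 3, 5)); // prints large city
    }
}
```

With a real `Span`, the start and end values would come from the span object, and the tokens must be the same array that was passed to `find`.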

how to use opennlp on eclipse

假如想象 submitted on 2019-12-24 12:33:54
Question: I am trying to install OpenNLP so I can use it for my NLP course project. I have Eclipse Kepler on my Windows 8 computer. I read many online pages about how to install it, but with no luck. I read http://sharpnlp.codeplex.com/discussions/263620 and many other links that the website won't allow me to add, but none of them seems to help. What I did is the following:

    Microsoft Windows [Version 6.2.9200]
    (c) 2012 Microsoft Corporation. All rights