uima | 易学教程

Maximum size for a single Wordlist-UIMA RUTA

Question: What is the maximum size for a wordlist in UIMA Ruta? I want to store lists of country, state, and city names.

Answer 1: There is no maximum size for wordlists in UIMA Ruta. The lines of the file are normally transferred into a char-based in-memory tree structure (a trie). This means that the size is restricted only by the available RAM, and its memory consumption grows less than linearly. My largest wordlist consisted of about 500k entries, as far as I remember. So a list of country names
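To illustrate why a char-based trie can grow more slowly than the raw text it stores, here is a minimal sketch in Java. It is not Ruta's actual wordlist implementation; the class name `WordTrie`, its methods, and the sample entries are all hypothetical and chosen only for illustration. Entries that share a prefix reuse the same nodes, so the node count stays below the total number of characters.

```java
import java.util.HashMap;
import java.util.Map;

/** Minimal char-based trie (illustrative sketch, not Ruta's implementation). */
public class WordTrie {

    /** One node per distinct prefix; children are keyed by the next character. */
    private static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        boolean terminal; // true if a complete entry ends at this node
    }

    private final Node root = new Node();
    private int nodeCount = 1; // the root counts as one node

    /** Adds one wordlist line, creating nodes only for characters not yet present. */
    public void add(String word) {
        Node current = root;
        for (char c : word.toCharArray()) {
            Node next = current.children.get(c);
            if (next == null) {
                next = new Node();
                current.children.put(c, next);
                nodeCount++;
            }
            current = next;
        }
        current.terminal = true;
    }

    /** Returns true only if the exact entry was added (a bare prefix does not match). */
    public boolean contains(String word) {
        Node current = root;
        for (char c : word.toCharArray()) {
            current = current.children.get(c);
            if (current == null) {
                return false;
            }
        }
        return current.terminal;
    }

    public int nodeCount() {
        return nodeCount;
    }

    public static void main(String[] args) {
        WordTrie trie = new WordTrie();
        String[] entries = {"New York", "New Jersey", "New Delhi", "Newark"};
        int totalChars = 0;
        for (String entry : entries) {
            trie.add(entry);
            totalChars += entry.length();
        }
        System.out.println("contains \"New Delhi\": " + trie.contains("New Delhi"));
        System.out.println("contains \"New\": " + trie.contains("New"));
        System.out.println(trie.nodeCount() + " nodes for " + totalChars + " characters");
    }
}
```

In this sketch the saving comes purely from shared prefixes, so lists with many similar entries (such as place names starting with "New" or "San") benefit most. The per-node overhead of a `HashMap` is high, so a production structure would likely use a more compact child representation.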

Why do I get “Editor could not be initialized” error while running UIMA Ruta scripts?

Question: I often get errors like this while running UIMA Ruta scripts. Why? What can I do to prevent it? Does it depend on my code, or is it related to the Eclipse IDE?

Error: Editor could not be initialized.
org.apache.uima.UIMARuntimeException
    at org.apache.uima.util.CasIOUtils.load(CasIOUtils.java:368)
    at org.apache.uima.util.CasIOUtils.load(CasIOUtils.java:312)
    at org.apache.uima.util.CasIOUtils.load(CasIOUtils.java:193)
    at org.apache.uima.util.CasIOUtils.load(CasIOUtils.java:218)
    at org.apache

UIMA Ruta, uimaFIT and DKPro: Which versions work together?

Question: In the GSCL 2013 Ruta tutorial, the versions of the components in the pom.xml are:

uimaj-core: 2.4.2
DKPro components: 1.5.0
ruta-core: 2.1.0

I then raised the version numbers step by step and found that version 1.8.0 of the DKPro components introduces the following exception:

Exception in thread "main" java.lang.NoSuchMethodError: org.apache.uima.cas.text.AnnotationIndex.withSnapshotIterators()Lorg/apache/uima/cas/FSIndex;
    at org.apache.uima.fit.util.FSCollectionFactory

Do I need to rewrite my entire java project if I want to use a single UIMA-dependent library?

Question: I want to use https://code.google.com/p/heideltime/ in a Java project. That code "fits into the UIMA pipeline", which is something I don't understand at all. UIMA looks like it's designed to solve a ton of problems that I don't have, so I'd just like to get the minimal amount of UIMA needed to run that code. Is there a simple example out there of how I can run a simple UIMA program? I've added

<dependency>
    <groupId>org.uimafit</groupId>
    <artifactId>uimafit</artifactId>
    <version>1.4.0</version

How to get the annotated text for a DictionaryAnnotator

Question: I have a dictionary created with the DictionaryCreator from UIMA. I would like to annotate a piece of text using the DictionaryAnnotator and that dictionary, but I could not figure out how to get the annotated text. Please let me know if you do. Any help is appreciated. The code, the dictionary file, and the descriptor are given below. P.S. I'm new to Apache UIMA.

XMLInputSource xml_in = new XMLInputSource("DictionaryAnnotatorDescriptor.xml");
ResourceSpecifier specifier =

Reusable version of DKPro Core pipeline

Question: I have set up DKPro Core as a web service that takes an input and provides a tokenised output. The service itself is set up as a Jersey resource:

@Path("/")
public class MyResource {
    public MyResource() {
        // Nothing here
    }

    @GET
    public String generate(@QueryParam("q") final String input) {
        try {
            final JCasIterable en = iteratePipeline(
                createReaderDescription(StringReader.class,
                    StringReader.PARAM_DOCUMENT_TEXT, input,
                    StringReader.PARAM_LANGUAGE, "en")
                ,createEngineDescription(StanfordSegmenter

UIMA with Spark

Question: As said here, there is some overlap between UIMA and Spark in their distribution infrastructures. I was planning to use UIMA with Spark (I am now moving to uimaFIT). Can anyone tell me what problems we really face when developing UIMA with Spark, and what issues we might encounter? (Sorry, I haven't done any research on this.)

Answer 1: The main problem is accessing objects, because UIMA tries to re-instantiate objects when running its analysis engines. If the objects have local references

How to run external ruta scripts from a maven project without placing the script or its typesystem in the classpath?

Question: Until now, I have been running Ruta scripts from a Maven project by creating an AnalysisEngine and a CAS and then processing the engine. To do this, I placed all the scripts and descriptor files (engine and type system) in the src/main/resources folder of the Maven project. Now I want to place the scripts and type system files in an external path and pass that path dynamically to the Java code that runs the scripts. Is this possible? If so, how? I simply placed the files (script and descriptor) in an

CPU usage too high while running Ruta Script

Question: CPU usage is too high while running a Ruta script, so I plan to use a GPU. Do I need to do anything additional to run the script on a GPU machine? Or is there an alternative way to reduce the CPU usage?

Sample script:

PACKAGE uima.ruta.example;
ENGINE utils.PlainTextAnnotator;
TYPESYSTEM utils.PlainTextTypeSystem;
WORDLIST EditorMarkerList = 'EditorMarker.txt';
WORDLIST EnglishStopWordList = 'EnglishStopWords.txt';
WORDLIST FirstNameList = 'FirstNames.txt';
WORDLIST