I am using tess4j, the Java wrapper for Tesseract. I also have the normal Tesseract installed. I am not exactly sure how tess4j is meant to work, but since it comes with a te
Maybe you don't have the tessdata folder in your main project folder.
This folder holds the data for every language Tesseract supports (it contains files with .traineddata, .bigrams, .fold, .lm, .nn, .params, .size and .word-freq extensions).
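A quick way to see whether the folder is really there is a small check like the one below. This is only a sketch: the class and method names are made up for illustration, and it assumes the application is started from the main project folder, so that the relative path "tessdata" resolves there.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of tess4j.
class TessdataCheck {

    /** Returns true if dir exists and holds at least one *.traineddata file. */
    static boolean hasTrainedData(Path dir) {
        if (!Files.isDirectory(dir)) {
            return false;
        }
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir, "*.traineddata")) {
            return files.iterator().hasNext();
        } catch (IOException e) {
            return false; // unreadable folder: treat it as missing
        }
    }

    public static void main(String[] args) {
        // Assumption: the working directory is the main project folder.
        Path tessdata = Paths.get("tessdata").toAbsolutePath();
        String verdict = hasTrainedData(tessdata)
                ? " looks usable"
                : " is missing or has no .traineddata files";
        System.out.println(tessdata + verdict);
    }
}
```

It only looks for .traineddata files; it does not verify that the file for the language you actually need is present.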
If you don't have it, follow these steps:
1. Extract the tessdata-master.zip file into your main project folder.
2. Rename tessdata-master to tessdata.

For those who use Maven and don't like to use global variables, this works for me:
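import java.io.File;
import net.sourceforge.tess4j.Tesseract;
import net.sourceforge.tess4j.TesseractException;
import net.sourceforge.tess4j.util.LoadLibs;

File imageFile = new File("C:\\random.png");
// getInstance() is deprecated in newer tess4j releases, where new Tesseract() is used instead
Tesseract instance = Tesseract.getInstance();
// In case you don't have your own tessdata, let it be extracted for you
File tessDataFolder = LoadLibs.extractTessResources("tessdata");
// Set the tessdata path
instance.setDatapath(tessDataFolder.getAbsolutePath());
try {
    String result = instance.doOCR(imageFile);
    System.out.println(result);
} catch (TesseractException e) {
    System.err.println(e.getMessage());
}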
Found here; tested with the Maven artifact net.sourceforge.tess4j:tess4j:3.4.1 (the linked example uses the 1.4.1 jar).
The TESSDATA_PREFIX environment variable, if defined, overrides everything else, including whatever is set by init or setDatapath; that may change in the near future, once an application can specify where its tessdata folder is.
http://code.google.com/p/tesseract-ocr/issues/detail?id=938
https://groups.google.com/forum/#!topic/tesseract-ocr/bkJwI8WmxSw
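If setDatapath seems to be ignored, it may be worth checking whether TESSDATA_PREFIX is set in the environment the JVM was started with. A minimal sketch, assuming the precedence described above holds for your Tesseract version; the class and method names are hypothetical, and treating an empty value as "not defined" is my own assumption:

```java
// Hypothetical helper, not part of tess4j.
class TessdataPrefixCheck {

    /**
     * Mirrors the rule described above: a defined TESSDATA_PREFIX wins over
     * the path given to setDatapath. An empty value is treated as undefined
     * (an assumption made for this sketch).
     */
    static String winningSetting(String envPrefix, String configuredPath) {
        boolean envDefined = envPrefix != null && !envPrefix.isEmpty();
        return envDefined ? envPrefix : configuredPath;
    }

    public static void main(String[] args) {
        String env = System.getenv("TESSDATA_PREFIX");
        // Placeholder for whatever path you pass to setDatapath.
        String configured = args.length > 0 ? args[0] : "tessdata";
        if (env != null && !env.isEmpty()) {
            System.err.println("Warning: TESSDATA_PREFIX=" + env
                    + " is set and may override " + configured);
        }
        System.out.println("Setting expected to take effect: "
                + winningSetting(env, configured));
    }
}
```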
Let your TESSDATA_PREFIX environment variable point to the tessdata folder of your Tess4j.
Usually you set this variable up on the system during installation, but you may find a solution here: How do I set environment variables from Java?
You have to set it on the system that runs your app, because the native Tesseract .dlls depend on this environment variable.
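Java has no supported way to change the environment of the JVM that is already running, but it can set variables for a child process through ProcessBuilder. So one possible workaround is a small launcher that starts the real application with TESSDATA_PREFIX already in place. This is only a sketch: the class name, the jar name and the "tessdata" location are placeholders, not anything tess4j provides.

```java
import java.io.File;
import java.util.Arrays;
import java.util.List;

// Hypothetical launcher, not part of tess4j.
class TessLauncher {

    /** Builds (but does not start) a child process whose environment carries TESSDATA_PREFIX. */
    static ProcessBuilder withTessdataPrefix(List<String> command, File prefixDir) {
        ProcessBuilder pb = new ProcessBuilder(command);
        // Only the child sees this value; the current JVM's environment is unchanged.
        pb.environment().put("TESSDATA_PREFIX", prefixDir.getAbsolutePath());
        pb.inheritIO(); // forward the child's output to this console
        return pb;
    }

    public static void main(String[] args) throws Exception {
        // Placeholder command line: replace my-ocr-app.jar with your own application.
        List<String> command = Arrays.asList("java", "-jar", "my-ocr-app.jar");
        Process child = withTessdataPrefix(command, new File("tessdata")).start();
        System.exit(child.waitFor());
    }
}
```

Which directory TESSDATA_PREFIX has to point at (the tessdata folder itself or its parent) depends on the Tesseract version, so check that against your installation.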