tess4j

Tesseract - ERROR net.sourceforge.tess4j.Tesseract - null

丶灬走出姿态 submitted on 2021-02-18 10:23:08
Question: I created a Java application that uses Tesseract to convert a given image or PDF to a string. It runs fine on my machine as a JUnit unit test, but when I run the full system, a RESTful API served by Tomcat that receives the image and runs Tesseract, I get the following error: 23:22:36.511 [http-nio-9999-exec-3] ERROR net.sourceforge.tess4j.Tesseract - null java.lang.NullPointerException: null at net.sourceforge.tess4j.util.PdfUtilities

Tess4j - Pdf to Tiff to tesseract - “Warning: Invalid resolution 0 dpi. Using 70 instead.”

梦想与她 submitted on 2020-02-06 07:25:51
Question: I am using tess4j (net.sourceforge.tess4j:tess4j:4.4.0) to run OCR on PDF files. As I understand it, I first have to convert the PDF to TIFF or PNG (is either preferred?), which I did like this: tesseract.doOCR(PdfUtilities.convertPdf2Tiff(inputPdfFile)); I get the following warning: Warning: Invalid resolution 0 dpi. Using 70 instead. Questions: Does it have any influence on my scan results? (If not, fine, I can switch off the warning.) Is there a way to set the DPI by hand, or should convertPdf

Tess4j - Pdf to Tiff to tesseract - “Warning: Invalid resolution 0 dpi. Using 70 instead.”

别等时光非礼了梦想. submitted on 2020-02-06 07:24:07
Question: I am using tess4j (net.sourceforge.tess4j:tess4j:4.4.0) to run OCR on PDF files. As I understand it, I first have to convert the PDF to TIFF or PNG (is either preferred?), which I did like this: tesseract.doOCR(PdfUtilities.convertPdf2Tiff(inputPdfFile)); I get the following warning: Warning: Invalid resolution 0 dpi. Using 70 instead. Questions: Does it have any influence on my scan results? (If not, fine, I can switch off the warning.) Is there a way to set the DPI by hand, or should convertPdf

java.lang.IllegalAccessError: tried to access method net.sourceforge.tess4j.Tesseract.<init>()V from class Tess4jTest.TestTess

岁酱吖の submitted on 2020-01-25 09:13:05
Question: I built a Java OCR project with Tesseract for use in Mirth. When I run the jar file from Mirth, I get this error. Searching for it, I found that Tesseract.java has an init() method declared protected void, and I think that may be the cause. What should I do? Thank you very much for your help. package Tess4jTest; import java.io.File; import java.io.IOException; import net.sourceforge.tess4j.*; public class TestTess { public static String Tc; public static String phone;
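A note on reading the error text: in a JVM message, `<init>` denotes a constructor, so the failing call here is `new Tesseract()` rather than the `init()` method. One possible cause (an assumption, since the excerpt does not confirm it) is that the host application loads a different copy of the class than the one the code was compiled against. The following is a minimal sketch for checking where a class was actually loaded from; `ClassOrigin` and `locationOf` are hypothetical names chosen for illustration.

```java
import java.net.URL;
import java.security.CodeSource;

final class ClassOrigin {
    /** Returns the jar or directory a class was loaded from, or a placeholder if unknown. */
    static String locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null) {
            // Classes from the JDK itself usually have no code source.
            return "(bootstrap or unknown)";
        }
        URL location = source.getLocation();
        return location == null ? "(bootstrap or unknown)" : location.toString();
    }

    public static void main(String[] args) {
        // Replace these with the class under suspicion when running inside the host application.
        System.out.println(locationOf(String.class));
        System.out.println(locationOf(ClassOrigin.class));
    }
}
```

Calling `ClassOrigin.locationOf(net.sourceforge.tess4j.Tesseract.class)` from inside the host application would show which jar supplies the class at run time, which can then be compared with the version used at compile time.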

OCR : Not getting desired result

雨燕双飞 submitted on 2020-01-24 01:19:29
Question: I have this image. I am trying to OCR the letters in it, but I am not getting the desired result for the characters '9' and 'R'. I first cropped these characters and then ran the following command: tesseract 9.png stdout -psm 8. It just returns ".". OCR works fine for all the other letters, but not for these two (though I think their image quality is not that bad). Any suggestions or help would be appreciated. Answer 1: I've no experience with tesseract myself, but replicating the character and adding
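Two things may be worth trying for tightly cropped single characters. Page segmentation mode 8 treats the image as a single word, while mode 10 treats it as a single character. Tesseract's documentation on improving output quality also suggests adding a small border around text that runs to the image edge. Below is a minimal Java sketch of the border step; `GlyphPadding` and `pad` are hypothetical names, the white background is an assumption about the image, and whether this fixes the '9' and 'R' cases is untested.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

final class GlyphPadding {
    /** Returns a copy of src centred on a white canvas with `border` extra pixels on every side. */
    static BufferedImage pad(BufferedImage src, int border) {
        if (border < 0) {
            throw new IllegalArgumentException("border must be non-negative");
        }
        BufferedImage out = new BufferedImage(
                src.getWidth() + 2 * border,
                src.getHeight() + 2 * border,
                BufferedImage.TYPE_INT_RGB);
        Graphics2D g = out.createGraphics();
        try {
            // Fill the whole canvas with white, then draw the original in the middle.
            g.setColor(Color.WHITE);
            g.fillRect(0, 0, out.getWidth(), out.getHeight());
            g.drawImage(src, border, border, null);
        } finally {
            g.dispose();
        }
        return out;
    }
}
```

The padded image can be written back with `javax.imageio.ImageIO.write(padded, "png", file)` and passed to the same tesseract command.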

Tess4J Mac: NoClassDefFoundError

爷,独闯天下 submitted on 2020-01-03 20:59:26
Question: I'm trying to use Tess4J in my project. It doesn't include .dylib files for Mac, so I built Tesseract myself and am using the .dylib from that build. I'm able to load the native library with no issue, and I believe I have the Tess4J library linked properly, since I can import it without problems. However, when I try to create a new instance of Tesseract using: Tesseract t = new Tesseract(); I get the following error: Exception in thread "main" java.lang.NoClassDefFoundError: com
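A `NoClassDefFoundError` means a class that was available at compile time could not be loaded at run time, which commonly points to a dependency jar missing from the runtime classpath. The class name in the excerpt is cut off, so the sketch below takes the names to check as arguments; `ClasspathCheck` and `missing` are hypothetical names, and the class names passed in are placeholders.

```java
import java.util.ArrayList;
import java.util.List;

final class ClasspathCheck {
    /** Returns the subset of classNames that cannot be loaded from the current classpath. */
    static List<String> missing(String... classNames) {
        ClassLoader loader = ClasspathCheck.class.getClassLoader();
        List<String> notFound = new ArrayList<>();
        for (String name : classNames) {
            try {
                // initialize=false: only locate the class, do not run its static initializers.
                Class.forName(name, false, loader);
            } catch (ClassNotFoundException | LinkageError e) {
                notFound.add(name);
            }
        }
        return notFound;
    }

    public static void main(String[] args) {
        // Pass fully qualified class names on the command line.
        System.out.println("Missing: " + missing(args));
    }
}
```

Running it with the same classpath as the failing program and the full class name from the stack trace shows whether the class is reachable at all.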

Tess4j on Windows 64-bit: exception on multiple threads

假装没事ソ submitted on 2020-01-01 15:24:38
Question: I am using Tesseract 3 with Java 8 on Windows 64-bit to OCR scanned PDFs. I have followed the instructions on the Tess4j page, used the 64-bit versions of the required DLLs, and installed 64-bit Ghostscript. When I run my unit test with the plain @Test (no arguments), the code runs correctly, so I assume everything is installed properly. When I run it with 2 threads in parallel (see below), I get an exception. I have read the relevant thread here, but there it is suggested
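If the exception comes from two threads sharing one engine instance (an assumption, since the excerpt shows neither the test nor the stack trace), one common pattern is to give each worker thread its own instance. The sketch below shows only the pattern: `OcrEngine` is a hypothetical stand-in interface, not a Tess4J type, and `PerThreadOcr` is an illustrative name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Supplier;

final class PerThreadOcr {
    /** Hypothetical stand-in for an OCR engine that must not be shared across threads. */
    interface OcrEngine {
        String recognize(String imageName);
    }

    // Each thread lazily creates and then reuses its own engine.
    private final ThreadLocal<OcrEngine> engines;

    PerThreadOcr(Supplier<OcrEngine> factory) {
        this.engines = ThreadLocal.withInitial(factory);
    }

    /** Runs recognition on a fixed pool and returns results in input order. */
    List<String> recognizeAll(List<String> imageNames, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String name : imageNames) {
                Callable<String> task = () -> engines.get().recognize(name);
                futures.add(pool.submit(task));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for OCR", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("OCR task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

This does not help if a native step underneath allows only one concurrent call, which may be the case for the Ghostscript-based PDF conversion; that step would then need to be serialized, for example with a shared lock.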

Tess4j on Windows 64-bit: exception on multiple threads

二次信任 submitted on 2020-01-01 15:24:06
Question: I am using Tesseract 3 with Java 8 on Windows 64-bit to OCR scanned PDFs. I have followed the instructions on the Tess4j page, used the 64-bit versions of the required DLLs, and installed 64-bit Ghostscript. When I run my unit test with the plain @Test (no arguments), the code runs correctly, so I assume everything is installed properly. When I run it with 2 threads in parallel (see below), I get an exception. I have read the relevant thread here, but there it is suggested

How to ignore special characters in Tesseract OCR using java

末鹿安然 submitted on 2019-12-26 06:33:44
Question: I have extracted text from an image with Tesseract OCR using Java, but the output contains some special characters because the image contains symbols. I want to ignore all the special characters and display only the text. Is there any way to do that? Answer 1: In tesseract you can set TessBaseAPI.VAR_CHAR_WHITELIST and TessBaseAPI.VAR_CHAR_BLACKLIST in order to ignore some special characters. The following would make tesseract recognize only A-Z and digits: String whiteList =
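The whitelist in the answer restricts what the engine is allowed to recognize (the underlying Tesseract variable is `tessedit_char_whitelist`). A different technique, shown here only as a sketch, is to leave recognition alone and strip unwanted characters from the returned string afterwards. `OcrTextFilter` and `stripSpecial` are hypothetical names, and treating everything except letters, digits, and whitespace as "special" is an assumption about what the asker wants.

```java
import java.util.regex.Pattern;

final class OcrTextFilter {
    // Anything that is not a Unicode letter, a decimal digit, or whitespace.
    private static final Pattern SPECIAL = Pattern.compile("[^\\p{L}\\p{Nd}\\s]");

    /** Removes special characters from OCR output, keeping letters, digits, and whitespace. */
    static String stripSpecial(String ocrText) {
        return SPECIAL.matcher(ocrText).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(stripSpecial("Total: $42.50 (approx.) #1"));
    }
}
```

Post-filtering keeps the recognizer unconstrained, so a symbol is dropped rather than forced into the nearest allowed letter; the whitelist is the better fit when the set of valid characters is known in advance.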

How to ignore special characters in Tesseract OCR using java

痞子三分冷 submitted on 2019-12-26 06:32:24
Question: I have extracted text from an image with Tesseract OCR using Java, but the output contains some special characters because the image contains symbols. I want to ignore all the special characters and display only the text. Is there any way to do that? Answer 1: In tesseract you can set TessBaseAPI.VAR_CHAR_WHITELIST and TessBaseAPI.VAR_CHAR_BLACKLIST in order to ignore some special characters. The following would make tesseract recognize only A-Z and digits: String whiteList =