OCR Tesseract - Tess4J behaving weirdly

Submitted by 风流意气都作罢 on 2019-12-12 01:55:42

Question


I am trying to extract text from an image. I use the code below to process the image and print the extracted text.

    import java.io.File;

    import net.sourceforge.tess4j.Tesseract;

    public class Test {

        // Note: the filename parameter is not used yet; the image path below is hardcoded.
        public static void extractText(String filename)
        // public static void main(String[] args)
        {
            // Point JNA at the native Tesseract libraries matching the JVM architecture.
            System.setProperty("jna.library.path", "32".equals(System.getProperty("sun.arch.data.model")) ? "lib/win32-x86" : "lib/win32-x86-64");

            File imageFile = new File("img_perspective.png");
            Tesseract instance = Tesseract.getInstance();  // JNA Interface Mapping
            // Tesseract1 instance = new Tesseract1(); // JNA Direct Mapping

            try {
                String result = instance.doOCR(imageFile);
                System.out.println(result);
            } catch (Exception e) {
                System.err.println(e.getMessage());
            }
        }
    }

When I run this code as the main method, the OCR engine works well and extracts the text. But when I turn the main method into a method named extractText() and call it from another class, it throws this exception:

    org.apache.catalina.core.StandardWrapperValve invoke
    SEVERE: Servlet.service() for servlet [com.patternrecognition.preprocessing.Preprocessing] in context with path  [/ImagePreprocessing] threw exception [Servlet execution threw an exception] with root cause
    java.lang.ClassNotFoundException: com.sun.media.imageio.plugins.tiff.TIFFImageWriteParam
        at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1718)
        at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1569)
        at net.sourceforge.tess4j.Tesseract.doOCR(Unknown Source)
        at net.sourceforge.tess4j.Tesseract.doOCR(Unknown Source)
        at com.patternrecognition.preprocessing.test.extractText(test.java:19)

I don't know what is wrong here. The code is exactly the same, with the filename still hardcoded. Only the method has changed.

This is so frustrating. Can someone please help?


Answer 1:


Make sure jai-imageio.jar is on the classpath, and call ImageIO.scanForPlugins() before running OCR.
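As a rough sketch of that advice, the snippet below rescans for Image I/O plugins and then reports whether a TIFF writer is registered. The class name `ImageIoPluginCheck` and the method `hasWriterFor` are made up for illustration; only `ImageIO.scanForPlugins()` and `ImageIO.getImageWritersByFormatName()` come from the JDK. It does not call Tess4J itself, so treat it as a pre-flight check to run before `doOCR`, not as a guaranteed fix.

```java
import javax.imageio.ImageIO;

public class ImageIoPluginCheck {

    /**
     * Hypothetical helper: rescans the classpath for Image I/O plugins,
     * then reports whether any writer is registered for the given format.
     */
    public static boolean hasWriterFor(String formatName) {
        // Picks up plugin jars (such as jai-imageio.jar) visible to the current class loader.
        ImageIO.scanForPlugins();
        return ImageIO.getImageWritersByFormatName(formatName).hasNext();
    }

    public static void main(String[] args) {
        // The result depends on the JVM version and on which jars are on the classpath.
        System.out.println("TIFF writer available: " + hasWriterFor("tiff"));
    }
}
```

If this prints false inside the web application but true from a standalone main method, the plugin jar is most likely not visible to the web application's class loader.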

Related question: tess4j with Spring MVC




Answer 2:


If this is a web application, make sure jai-imageio.jar is in the WEB-INF/lib folder. It worked for me.
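One way to confirm whether the jar is actually visible from inside the web application is to try loading the missing class by name. The sketch below is an assumption-laden diagnostic, not part of Tess4J: `ClassVisibilityCheck` and `isVisible` are hypothetical names, and it assumes the thread's context class loader is the web application's class loader, which is the usual arrangement in a servlet container.

```java
public class ClassVisibilityCheck {

    /**
     * Hypothetical helper: returns true if the named class can be found
     * through the current thread's context class loader.
     */
    public static boolean isVisible(String className) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            // Fall back to the loader that loaded this class.
            loader = ClassVisibilityCheck.class.getClassLoader();
        }
        try {
            // 'false' means: locate the class without running its static initializers.
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The class named in the stack trace from the question.
        String tiffParam = "com.sun.media.imageio.plugins.tiff.TIFFImageWriteParam";
        System.out.println(tiffParam + " visible: " + isVisible(tiffParam));
    }
}
```

Calling `isVisible(...)` from the servlet before the OCR call shows whether the class from the stack trace can be loaded there, which helps separate a packaging problem from a problem in the OCR code.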



Source: https://stackoverflow.com/questions/27027880/ocr-tesseract-tess4j-behaving-weirdly
