tesseract

Forcing Tesseract to match pattern (four digits in a row)

寵の児 posted on 2019-12-22 11:37:52
Question: I'm trying to get Tesseract (using the Tess4J wrapper) to match only a specific pattern. The pattern is four digits in a row, which I think would be \d\d\d\d. Here is a VERY small subset of the image I'm feeding Tesseract (the floorplans are restricted, so I'm cautious about posting much more of it): http://mike724.com/view/a06771 I'm using the following Java code: File imageFile = new File("/<redacted>/file.pdf"); Tesseract instance = Tesseract.getInstance(); instance.setTessVariable("load_system

CMake for Tesseract and OpenCV

江枫思渺然 posted on 2019-12-22 11:12:54
Question: I am new to Linux programming, and I am trying to create an OCR application on Ubuntu 12.10 using Tesseract and OpenCV. So far I have set up Tesseract and OpenCV on Linux and followed this tutorial, which made it very easy: we create one file, CMakeLists.txt, and link OpenCV in it. Now I am trying to compile the tesseract-ocr library with this code. As far as I know, I have not linked tesseract-ocr to my code, and that's why I am getting errors. All I want and searching for is if

Java exception: Exception in thread "main" java.lang.NoClassDefFoundError: net/sourceforge/tess4j/Tesseract

空扰寡人 posted on 2019-12-22 10:39:11
Question: I am trying to get tess4j (OCR) working, and I'm using this code: import java.awt.image.RenderedImage; import java.io.File; import java.net.URL; import javax.imageio.ImageIO; import net.sourceforge.tess4j.*; public static void main(String[] args) throws Exception{ URL imageURL = new URL("http://s4.postimg.org/e75hcme9p/IMG_20130507_190237.jpg"); RenderedImage img = ImageIO.read(imageURL); File outputfile = new File("saved.png"); ImageIO.write(img, "png", outputfile); try {

How to detect tables in images using tesseract 4.0 or using pytesseract?

筅森魡賤 posted on 2019-12-22 09:56:45
Question: I want to detect tables in images: identify the table blocks and possibly the text within them. In previous versions of Tesseract, one could use the parameter textord_dump_table_image. How do I extract tables in Tesseract 4.0? Answer 1: It is quite bizarre that there is currently no API available to directly get table regions in Tesseract. However, you can use a small hack 'coughs' to get the table coordinates. There is a configuration option textord_show_tables for Tesseract. Set it to true using

OCR - how to get text from outlined words

一曲冷凌霜 posted on 2019-12-22 08:28:07
Question: I have an image of text where the words are outlined rather than filled in. Tesseract is struggling to get any of the words correct. Does anyone have a solution to this type of problem? I have tried simple operations like inversion, but to no effect. I'm guessing Tesseract already handles this. Image example: Typical output for Next: New Typical output for Previous: Pflevuows My (very simple) code, which takes the image as an argument: import pytesseract import sys from PIL import Image print

Crop pictures with Leptonica API -> OR which image processing Lib to use?

▼魔方 西西 posted on 2019-12-22 08:22:21
Question: I'm trying to do two things. First, I need to read in an image and crop it (the coordinates / frame will be provided by the user). Then I want to run OCR over it. (The cropping and the OCR should be strictly separated.) Now to my problem: for the OCR I'm using Tesseract, which uses the Leptonica API for image processing. Since I'm programming for an embedded device, I want to keep the number of different libraries low. So my best interest is to crop my image with Leptonica, so I

OCR for Equations and Formulae on the iOS Platform (Xcode)

▼魔方 西西 posted on 2019-12-22 04:47:11
Question: I'm currently developing an application that uses the camera of an iOS device to recognise equations in a photo and then match them to the correct equation in a library or database: basically an equation scanner. For example, you could scan an image of the Uncertainty Principle or the Schrodinger Equation, and the iOS device would be able to tell the user its name and give certain feedback. I was wondering how to implement this using Xcode. I was thinking of using an open-source framework

Can tesseract be trained for non-font symbols?

时光毁灭记忆、已成空白 posted on 2019-12-22 04:38:13
Question: I'm curious about how I might more reliably recognise the value and the suit of playing card images. Here are two examples: There may be some noise in the images, but I have a large dataset of images that I could use for training (roughly 10k PNGs, including all values and suits). I can reliably recognise images that I've manually classified if I have a known exact match using a hashing method. But since I'm hashing images based on their content, then the slightest noise changes the

Can `tesseract-ocr` put the result to STDOUT?

蹲街弑〆低调 posted on 2019-12-22 01:27:24
Question: Using tesseract-ocr 3.02.02. The basic usage of tesseract is tesseract sourc.png result, which generates result.txt. To get the result text, I have to cat this file. Is there an option to dump the result to stdout? Answer 1: You should upgrade to v3.03, where support for stdout was added. Answer 2: The solution is: tesseract input.jpg stdout But you need at least version 3.03. Source: https://stackoverflow.com/questions/24347819/can-tesseract-ocr-put-the-result-to-stdout
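For anyone calling this from a script, here is a minimal Python sketch of the `tesseract input.jpg stdout` form from the answers above. It assumes Tesseract 3.03 or later is on the PATH; the helper names `build_tesseract_command` and `ocr_to_string` are made up for illustration and are not part of Tesseract or any wrapper.

```python
import shutil
import subprocess


def build_tesseract_command(image_path, executable="tesseract"):
    """Return the argument list that sends OCR text to standard output."""
    # The literal word "stdout" takes the place of the output base name,
    # so no result.txt file is written (assumes Tesseract 3.03 or later).
    return [executable, image_path, "stdout"]


def ocr_to_string(image_path):
    """Run Tesseract on image_path and return the recognized text."""
    result = subprocess.run(
        build_tesseract_command(image_path),
        capture_output=True,  # collect stdout and stderr instead of printing
        text=True,            # decode the output to str
        check=True,           # raise CalledProcessError on a non-zero exit
    )
    return result.stdout


if __name__ == "__main__":
    if shutil.which("tesseract") is None:
        print("tesseract is not installed or not on PATH")
    else:
        # "input.jpg" is the placeholder file name used in the answer.
        print(ocr_to_string("input.jpg"))
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.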

How to get skew angle from image

半城伤御伤魂 posted on 2019-12-22 00:00:10
Question: I am having trouble getting the skew angle from an image. I am using the Tesseract API for image processing. I have searched the web extensively but found no suitable solution. I have used the following code: Pix test=ReadFile.readBitmap(bitmap.createBitmap(400, 400, Config.ARGB_8888)); float angle=Skew.findSkew(test); With the code above, I get an angle value of 0.0. Please help me resolve this problem or point me in the right direction. Answer 1: TessBaseAPI baseApi = new TessBaseAPI(); baseApi