tesseract | 易学教程

Assert failed - Training Tesseract

I'm trying to train Tesseract with Serak Tesseract Trainer (https://code.google.com/p/serak-tesseract-trainer/), and I can't figure out why the following error appears in CMD while executing Train Tesseract. Any help?
    Reading a.tr ...
    Font id = -1/0, class id = 1/46 on sample 0
    font_id >= 0 && font_id < font_id_map_.SparseSize():Error:Assert failed:in file ..\classify\trainingsampleset.cpp, line 622
Answer: Before writing your font data, put a '\n' character at the beginning of the file (just hit Enter). Worked for me.
Source: https://stackoverflow.com/questions/18921810/assert-failed-training-tesseract
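The workaround in the answer (a blank first line in the font data file) can also be applied from a script. The following is a minimal sketch, not part of the original answer: the helper name `ensure_leading_newline` and the file name `font_data.txt` are hypothetical, and which file the trainer actually reads is not stated in the excerpt.

```python
from pathlib import Path


def ensure_leading_newline(path):
    """Prepend a newline to the file unless it already starts with one.

    Returns True if the file was modified, False otherwise.
    """
    file_path = Path(path)
    data = file_path.read_bytes()
    # Treat a Unix or a Windows line ending as "already starts with a blank line".
    if data.startswith(b"\n") or data.startswith(b"\r\n"):
        return False
    file_path.write_bytes(b"\n" + data)
    return True


if __name__ == "__main__":
    # "font_data.txt" is a placeholder; point this at your own font data file.
    target = Path("font_data.txt")
    if target.exists():
        print("modified" if ensure_leading_newline(target) else "unchanged")
```

Running the helper twice is safe: the second call leaves the file untouched.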

android - recognized text from tess-two library is wrong

I am trying to use the tess-two library to recognize text from an image. Here is my code:
    load.setOnClickListener(new View.OnClickListener() {
        @Override
        public void onClick(View v) {
            // recognize text
            Bitmap temp = loadJustTakenImage(); // loads taken image from sdcard
            Bitmap rotatedImage = rotateIfNeeded(temp); // rotate method I found in some tutorial
            String text1 = recognizeText(rotatedImage);
        }
    });
Recognize text method (the tessdata folder is in Download, with eng.traineddata and other files):
    private String recognizeText(Bitmap bitmap) {
        // TODO Auto-generated method stub
        TessBaseAPI baseApi

fatal error: strtok_r.h: No such file or directory (while compiling tesseract-ocr-3.01 in MinGW)

Question: I'm compiling tesseract-ocr-3.01 in MinGW, and I'm getting this error:
    ambigs.cpp:31:22: fatal error: strtok_r.h: No such file or directory
This is the code where the error occurs:
    #ifdef WIN32
    #ifndef __GNUC__
    #define strtok_r strtok_s
    #else
    #include "strtok_r.h"
    #endif /* __GNUC__ */
    #endif /* WIN32 */
Edit: I found this feature request to add strtok_r.h to MinGW. From the comments there: strtok_r() is an optional POSIX function, required only for implementations which support POSIX threads.

How do I get accurate text using Tesseract OCR in iOS?

Question: I am working on an iPhone application. I need to get text from images, and after googling I found that Tesseract can do that. It's working fine but not giving accurate results. I used this and processed the image, but I'm still not getting good results.
    Tesseract* tesseract = [[Tesseract alloc] initWithDataPath:@"tessdata" language:@"eng"];
    UIImage *selectedImage=[UIImage imageNamed:@"download.jpg"];
    [tesseract setImage:selectedImage];
    ImageWrapper *greyScale=Image::createImage(selectedImage,

How to hide the console window when I run tesseract with pytesseract with CREATE_NO_WINDOW

Question: I am using Tesseract to perform OCR on screengrabs. I have an app with a tkinter window that uses self.after in my class's initialization to perform constant image scrapes and update labels and other values in the window. I have searched for multiple days and can't find any specific examples of how to use CREATE_NO_WINDOW with Python 3.6 on a Windows platform when calling tesseract through pytesseract. This is related to this question: How can I hide the console window when I run tesseract
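For context, CREATE_NO_WINDOW is a Windows process-creation flag that Python's `subprocess` module accepts through the `creationflags` argument. The sketch below is not a pytesseract solution: it bypasses pytesseract and calls the `tesseract` command-line program directly, which is an assumption about what is acceptable here. The helper names `no_window_kwargs` and `ocr_to_text` are hypothetical.

```python
import subprocess
import sys

# Win32 process-creation flag. subprocess.CREATE_NO_WINDOW exposes the same
# value on Windows from Python 3.7 onward; on 3.6 it has to be spelled out.
CREATE_NO_WINDOW = 0x08000000


def no_window_kwargs(platform=sys.platform):
    """Return extra subprocess keyword arguments that suppress the console window."""
    if platform == "win32":
        return {"creationflags": CREATE_NO_WINDOW}
    # creationflags is only supported on Windows, so pass nothing elsewhere.
    return {}


def ocr_to_text(image_path, tesseract_cmd="tesseract"):
    """Run the tesseract CLI on image_path and return the recognized text."""
    # Passing "stdout" as the output base makes the CLI print the text
    # instead of writing a .txt file.
    result = subprocess.run(
        [tesseract_cmd, str(image_path), "stdout"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        check=True,
        **no_window_kwargs(),
    )
    return result.stdout.decode("utf-8", errors="replace")
```

Inside a tkinter callback scheduled with `self.after`, `ocr_to_text("screengrab.png")` would then run without opening a console window on Windows, assuming `tesseract` is on the PATH.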

Directory: assets/tessdata

I've downloaded an OCR text recognizer from GitHub. My problem is that I want to launch my app without being online, but every time I install the APK on my phone, it starts downloading the English language data and the Tesseract OCR engine. I found an online guide that says I have to create a folder called "tessdata" in the assets folder and put eng.traineddata and osd.traineddata in it. I've tried that, but the download still starts when I install the app for the first time. What can I do to make this app completely offline?
Answer: First, in your project directory in computer (

Tesseract thinks my 1's are 7's

Question: It seems like this is probably a common issue with OCR. Is there a way to tell Tesseract that my 1's are actually 1's? Hopefully without changing my 7's into 1's in the process. Note: these are scanned documents and I have no idea what font was used.
Answer 1: If Tesseract is trainable, try training it on the font manually; that should solve the problem. Another possible solution is to add a small validation module that runs after Tesseract: for all 1s and 7s, double check them using intensity
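The answer is cut off before it describes the intensity check, so the following is only one guess at what a post-OCR validation step might look like, not the answerer's method. It assumes each suspect character is available as a binarized glyph cropped to its bounding box (a list of rows, 1 = ink), and uses an untuned heuristic: a 7 normally has a horizontal stroke spanning most of its top edge, while a 1 does not. The function names, band size, and threshold are all hypothetical.

```python
def top_stroke_coverage(glyph, band_fraction=0.2):
    """Fraction of columns that contain ink within the top band of rows."""
    height = len(glyph)
    width = len(glyph[0])
    # Always inspect at least one row, even for very small glyphs.
    band_rows = max(1, int(height * band_fraction))
    inked_columns = sum(
        1
        for col in range(width)
        if any(glyph[row][col] for row in range(band_rows))
    )
    return inked_columns / width


def recheck_one_or_seven(glyph, threshold=0.6):
    """Return "7" if the top band is mostly inked across its width, else "1"."""
    return "7" if top_stroke_coverage(glyph) >= threshold else "1"
```

A check like this would run only on characters Tesseract already labeled 1 or 7. The threshold would need tuning on real scans, and fonts where the 1 has a long top flag may still fool it.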

Bytes Per Pixel value for byte representation of image in Android

I'm currently writing an Android application which needs to use OCR. To achieve this I am using Tesseract in conjunction with the tesseract-android-tools project. I have managed to get the Tesseract API to initialize and need to use the following setImage function:
    void com.googlecode.tesseract.android.TessBaseAPI.setImage(byte[] imagedata, int width, int height, int bpp, int bpl)
What I am struggling with is how to get the correct values for bpp (bytes per pixel) and bpl (bytes per line). Does anyone know how I can get these values? I have put fairly random values in there at the
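The arithmetic behind the two values can be sketched independently of Android. This is an illustration rather than the library's documented procedure, and it assumes the buffer is tightly packed with no padding at the end of each row: bpp then follows from the pixel format (4 for 32-bit RGBA, 1 for 8-bit grayscale) and bpl is the width times bpp. The function names are hypothetical.

```python
def bytes_per_line(width, bytes_per_pixel):
    """Bytes in one row of a tightly packed image (no row padding assumed)."""
    if width <= 0 or bytes_per_pixel <= 0:
        raise ValueError("width and bytes_per_pixel must be positive")
    return width * bytes_per_pixel


def bpp_and_bpl(buffer_length, width, height, bytes_per_pixel):
    """Return (bpp, bpl) after checking they are consistent with the buffer size."""
    bpl = bytes_per_line(width, bytes_per_pixel)
    expected = bpl * height
    if buffer_length != expected:
        # A mismatch suggests a different pixel format or padded rows.
        raise ValueError(
            "buffer has %d bytes, expected %d" % (buffer_length, expected)
        )
    return bytes_per_pixel, bpl
```

For a 640 x 480 image at 4 bytes per pixel, the buffer holds 1,228,800 bytes and `bpp_and_bpl(1228800, 640, 480, 4)` yields bpp 4 and bpl 2560. If the length check fails, the rows are probably padded and bpl has to come from the image source instead.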

tesseract OCR in iphone application

Question: I am using the Tesseract open-source OCR engine to read text from images, but I have never once gotten a 100% accurate result. Please give your suggestions for improving OCR quality with Tesseract. Thanks.
Answer 1: Here is how to get the best results from Tesseract. Make sure you have preprocessed the image. OCR will produce the best results for images that have the following properties:
- fix DPI (if needed); 300 DPI is the minimum
- fix text size (e.g. 12 pt should be ok)
- try to fix text lines
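The DPI advice above comes down to a scale factor. As a rough sketch (the function name and the choice to only ever upscale are assumptions, not part of the answer), the pixel dimensions needed to bring an image up to 300 DPI can be computed like this, leaving the actual resampling to whatever image library is in use:

```python
def upscale_size_for_dpi(width, height, current_dpi, target_dpi=300):
    """Return the (width, height) in pixels that brings an image up to target_dpi.

    Images already at or above the target are returned unchanged.
    """
    if current_dpi <= 0 or target_dpi <= 0:
        raise ValueError("DPI values must be positive")
    if current_dpi >= target_dpi:
        return width, height
    scale = target_dpi / current_dpi
    return round(width * scale), round(height * scale)
```

An 800 x 600 scan at 150 DPI would be resized to 1600 x 1200 before being passed to Tesseract.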

Need some advices to learn OCR related techniques [closed]

Closure notice: It's difficult to tell what is being asked here. This question is ambiguous, vague, incomplete, overly broad, or rhetorical and cannot be reasonably answered in its current form. For help clarifying this question so that it can be reopened, visit the help center. Closed 7 years ago.
Question: I'm working on an OCR project for iPhone using the Tesseract OCR engine. I'm planning to write the following modules:
- Capture image from iPhone camera
- Pre-process the image to refine it, in order to improve the