tesseract | 易学教程

Multiple subprocesses take a lot of time to complete

Question: I have a single process that is run using the subprocess module's Popen:

    result = subprocess.Popen(['tesseract', 'mypic.png', 'myop'])
    st = time()
    while result.poll() is None:
        sleep(0.001)
    en = time()
    print('Took :' + str(en - st))

Which results in:

    Took :0.44703030586242676

Here, a tesseract call is made to process an image mypic.png (attached) and output the OCR's result to myop.txt. Now I want this to happen on multiple processes on behalf of this comment (or see this directly), so the code is
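The excerpt is cut off before the multi-process version. Purely as an illustration, and not the asker's code, the general pattern of starting several subprocesses and timing them together might look like the sketch below. The helper name `run_concurrently` is hypothetical, and the stand-in commands run the Python interpreter instead of tesseract so the example runs without an OCR install.

```python
import subprocess
import sys
from time import time


def run_concurrently(commands):
    """Start every command first, then wait for all of them.

    Returns (exit_codes, elapsed_seconds). Hypothetical helper for illustration.
    """
    start = time()
    # Launch all processes before waiting on any, so they run side by side.
    procs = [subprocess.Popen(cmd) for cmd in commands]
    # wait() blocks until each process exits; no sleep/poll loop is needed.
    exit_codes = [p.wait() for p in procs]
    return exit_codes, time() - start


if __name__ == "__main__":
    # Stand-in commands (assumption). With tesseract installed, each entry
    # could instead be something like ['tesseract', 'mypic.png', 'myop'].
    commands = [[sys.executable, "-c", "pass"] for _ in range(4)]
    codes, elapsed = run_concurrently(commands)
    print("Exit codes:", codes)
    print("Took :" + str(elapsed))
```

Whether the total time scales well with the number of processes depends on the machine's cores and on how many threads each child process uses, so timings from a sketch like this should be measured rather than assumed.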

How to process the image for Tesseract in Java?

Question: I am trying to read characters from the image below using Tesseract. Here is my code for reading the image:

    Tesseract tesseract = new Tesseract();
    try {
        String text = tesseract.doOCR(new File(path)); // path of your image file
        System.out.println(text);
    } catch (TesseractException e) {
        e.printStackTrace();
    }

I failed to get accurate text from the image. So how can I process the image before reading it?

Answer 1: Tesseract is not suitable for captcha breaking.

Source: https://stackoverflow.com

OCR for android application tess4j

Question: Basically I am designing an application that will capture an image from the Android device's default camera and display that image in an image view. That part works fine.

    capt_but.setOnClickListener(new View.OnClickListener() {
        //@Override
        // TODO Auto-generated method stub
        public void onClick(View v) {
            Intent cameraIntent = new Intent(android.provider.MediaStore.ACTION_IMAGE_CAPTURE);
            startActivityForResult(cameraIntent, CAMERA_REQUEST);
        }
    });
    }

    protected void onActivityResult(int

Training tesseract - shapeclustering issue

Question: I'm trying to train tesseract (adding a new, digit-only font) as per the instructions found here: http://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract3

What I've done:

- Created a PDF with sample text, converted it to tif, and ran tesseract num.dot.exp0.tif num.dot.exp0 batch.nochop makebox digits. Then edited the generated box file, correcting wrong detections.
- Ran tesseract in training mode: tesseract num.dot.exp0.tif num.dot.exp0 nobatch box.train and extracted the unicharset with

How can I identify the color of the letters in these images?

Question: I am using this article to solve captchas. It works by removing the background from the image using AForge, and then applying Tesseract OCR to the resulting cleaned image. The problem is that it currently relies on the letters being black, and since each captcha has a different text color, I need to either pass the color to the image cleaner or change the color of the letters to black. To do either one, I need to know what the existing color of the letters is. How might I go about identifying
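One simple way to approach this kind of problem, sketched here under stated assumptions rather than taken from the question or its answers, is to count pixel colors and pick the most frequent one that is not close to the background. The function name `dominant_foreground_color`, the white default background, and the tolerance value are all hypothetical; the pixels are plain RGB tuples so the sketch does not depend on any imaging library.

```python
from collections import Counter


def dominant_foreground_color(pixels, background=(255, 255, 255), tolerance=30):
    """Return the most common RGB color that is not near the background.

    pixels: iterable of (r, g, b) tuples.
    background and tolerance are assumptions chosen for illustration.
    Returns None when every pixel counts as background.
    """
    def is_background(color):
        # A pixel is background if every channel is within the tolerance.
        return all(abs(c - b) <= tolerance for c, b in zip(color, background))

    counts = Counter(p for p in pixels if not is_background(p))
    if not counts:
        return None
    return counts.most_common(1)[0][0]


if __name__ == "__main__":
    # Tiny synthetic "image": mostly white, with red letters and one dark speck.
    pixels = [(255, 255, 255)] * 6 + [(200, 30, 30)] * 3 + [(10, 10, 10)]
    print(dominant_foreground_color(pixels))
```

Exact-match counting like this is fragile on anti-aliased or noisy captchas, where letter pixels vary slightly in color; grouping similar colors into buckets before counting would likely be needed in practice.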

Tess4J Mac: NoClassDefFoundError

Question: I'm trying to use Tess4J in my project. It doesn't include .dylib files for Mac, so I've built my own Tesseract and am using the .dylib from the Tesseract build. I'm able to load the native library with no issue, and I believe I have the Tess4J library linked properly, since I can import it with no issue. However, when I try to create a new instance of Tesseract using: Tesseract t = new Tesseract(); I'm getting the following error: Exception in thread "main" java.lang.NoClassDefFoundError: com

Link tesseract libs with QtCreator

Question: I'm trying to run a C++ program based on the tesseract API, using QtCreator as the IDE on Ubuntu, in order to perform page layout analysis:

    int main(void) {
        int left, top, right, bottom;
        tesseract::TessBaseAPI tessApi;
        tessApi.InitForAnalysePage();
        cv::Mat img = cv::imread("document.png");
        tessApi.SetImage(reinterpret_cast<const uchar*>(img.data), img.size().width, img.size().height, img.channels(), img.step1());
        tesseract::PageIterator *iter = tessApi.AnalyseLayout();
        while (iter-

Can't Compile Tesseract API example for Windows using Tesseract 3.0.2.02 archive

Question: I'm looking at using Tesseract to do some work with PDF files, and so I want to use the library rather than an external executable. I started by downloading the full Tesseract source and looking at building that. Sadly the standard sources don't have any means to build on a non-Linux platform, in my case Windows. There are methods for doing so, and I looked at those. Firstly the VS2008 build doesn't. I'm aware that it needs Leptonica, but I figured I'd tackle that afterwards and just tried to

Strange Error When Using Tesseract in VB.net

Question: I have the current code:

    Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
        Dim Bitmap As New Bitmap("image.png")
        Dim ocr As tessnet2.Tesseract = New tessnet2.Tesseract()
        ocr.SetVariable("tessedit_char_whitelit", "0123456789")
        ocr.Init("c:\", "fra", False)
        Dim result As List(Of tessnet2.Word) = ocr.DoOCR(Bitmap, Rectangle.Empty)
        For Each word As tessnet2.Word In result
            RichTextBox1.Text &= word.Text & "(" & word.Confidence & ") "
        Next

How to reduce size of tessdata used for TessBaseAPI in android?

Question: I have an Android application where I am using Tesseract OCR, i.e. the TessBaseAPI. This requires tessdata, which is a 21 MB file. My final release APK comes to approximately 19 MB, which I find quite a lot. Is there any way to reduce the size of tessdata, or of my app, or anything else that will help me reduce the final APK size?

Answer 1: You can use the 3.01 version of the .traineddata files. They are much smaller and are still compatible with newer versions of Tesseract.

Source: https:/