ocr

Pytesseract is too slow. How can I make it process images faster?

佐手、 posted on 2019-12-24 00:22:26
Question: I am using pytesseract in the code below:

    def fnd():
        for fname in list:
            x = None
            x = np.array([np.array(PIL.Image.open(fname))])
            print x.size
            for im in x:
                txt = pytesseract.image_to_string(image=im).encode('utf-8').strip()
                open("Output.txt","a+").write(txt)
        with open("Output.txt") as openfile:
            for line in openfile:
                for part in line.split():
                    if "cyber" in part.lower():
                        print(line)
        return

The list contains the names of images from a folder (2408*3506, 300 res, gray-scaled). Unfortunately for around
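A minimal sketch of one way to reduce wall-clock time, not a confirmed fix for the question above. It assumes the bottleneck is the per-image OCR call: pytesseract runs the `tesseract` executable as a separate process for each call, so several images can be processed concurrently from a thread pool. It also searches the recognized text in memory instead of re-reading `Output.txt`. The names `default_ocr`, `lines_containing`, `search_images`, and the `workers` parameter are hypothetical and do not come from the original post.

```python
from concurrent.futures import ThreadPoolExecutor


def default_ocr(path):
    """OCR one image file and return its text."""
    # Imported lazily so the pure-Python helpers below work without Tesseract.
    import pytesseract
    from PIL import Image

    with Image.open(path) as img:
        return pytesseract.image_to_string(img)


def lines_containing(text, keyword):
    """Return the lines of `text` that contain `keyword`, case-insensitively."""
    keyword = keyword.lower()
    return [line for line in text.splitlines() if keyword in line.lower()]


def search_images(paths, keyword, ocr=default_ocr, workers=4):
    """Run `ocr` over all paths concurrently and collect matching lines per path."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        texts = list(pool.map(ocr, paths))
    return {path: lines_containing(text, keyword) for path, text in zip(paths, texts)}
```

A call such as `search_images(image_paths, "cyber")` would return a dictionary mapping each path to its matching lines. Whether this helps depends on the number of CPU cores available; downscaling very large scans before OCR is another option to measure.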

Training Tesseract on Android [closed]

假如想象 posted on 2019-12-23 21:03:52
Question: [Closed as off-topic 5 months ago.] I am using the tess-two library for OCR on Android. I want to create the training data on Android. I have followed this link and successfully created training data on a Linux system. How can I do the same on Android using tess-two or any other library? Answer 1: The tess-two library for Android uses the

Recognize simple digits with pytesser

喜欢而已 posted on 2019-12-23 20:52:54
Question: I'm learning OCR using PyTesser and Tesseract. As the first milestone, I want to write a tool to recognize a captcha that consists only of digits. I read some tutorials and wrote this test program:

    from pytesser.pytesser import *
    from PIL import Image, ImageFilter, ImageEnhance

    im = Image.open("test.tiff")
    im = im.filter(ImageFilter.MedianFilter())
    enhancer = ImageEnhance.Contrast(im)
    im = enhancer.enhance(2)
    im = im.convert('1')
    text = image_to_string(im)
    print "text={}".format(text)
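For digit-only input, one option is to tell Tesseract to consider only the characters 0-9. The sketch below swaps PyTesser for pytesseract (a different wrapper from the one used in the question) and passes Tesseract's `tessedit_char_whitelist` variable together with `--psm 7`, which treats the image as a single text line. Whether the whitelist is honoured depends on the Tesseract version and engine mode, so the result is also filtered in Python. `DIGIT_CONFIG`, `keep_digits`, and `read_digits` are hypothetical names.

```python
# Assumption: the captcha is a single line of digits.
DIGIT_CONFIG = "--psm 7 -c tessedit_char_whitelist=0123456789"


def keep_digits(text):
    """Drop every character that is not an ASCII digit."""
    return "".join(ch for ch in text if ch in "0123456789")


def read_digits(image_path):
    """OCR an image file and return only the digits that were recognized."""
    # Imported lazily so keep_digits can be used without Tesseract installed.
    import pytesseract
    from PIL import Image

    with Image.open(image_path) as im:
        raw = pytesseract.image_to_string(im, config=DIGIT_CONFIG)
    return keep_digits(raw)
```

The filter in `keep_digits` is a safety net only: it removes stray characters but cannot recover digits that Tesseract misread as letters.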

pytesseract error Windows Error [Error 2]

妖精的绣舞 posted on 2019-12-23 20:41:51
Question: Hi, I am trying the Python library pytesseract to extract text from an image. Here is the code:

    from PIL import Image
    from pytesseract import image_to_string
    print image_to_string(Image.open(r'D:\new_folder\img.png'))

But the following error occurred:

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "C:\Python27\lib\site-packages\pytesseract\pytesseract.py", line 161, in image_to_string
        config=config)
      File "C:\Python27\lib\site-packages\pytesseract\pytesseract.py", line
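The traceback above is cut off, so the cause is not confirmed. Windows error 2 means "file not found", and with pytesseract a frequent reason is that the `tesseract` executable is not installed or not on `PATH`. The sketch below looks for the executable and, if found, points pytesseract at it through `pytesseract.pytesseract.tesseract_cmd`. The helper names `locate_tesseract` and `configure_pytesseract`, and any candidate paths passed in, are hypothetical.

```python
import os
import shutil


def locate_tesseract(candidates=()):
    """Return a path to the tesseract executable, or None if none is found."""
    found = shutil.which("tesseract")  # search PATH first
    if found:
        return found
    for path in candidates:  # then try explicit install locations
        if os.path.isfile(path):
            return path
    return None


def configure_pytesseract(candidates=()):
    """Point pytesseract at the located executable; raise if there is none."""
    import pytesseract  # imported lazily; requires the package to be installed

    path = locate_tesseract(candidates)
    if path is None:
        raise RuntimeError("tesseract executable not found; install Tesseract "
                           "or pass its full path in `candidates`")
    pytesseract.pytesseract.tesseract_cmd = path
    return path
```

On Windows, a call might look like `configure_pytesseract([r"C:\Program Files\Tesseract-OCR\tesseract.exe"])`, where that path is an assumed install location and should be replaced with the real one.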

How to improve text extraction from an image?

我的梦境 posted on 2019-12-23 17:35:52
Question: I am using pytesseract to extract text from images. Before extracting text with pytesseract, I use Pillow and cv2 to reduce noise and enhance the image:

    import numpy as np
    import pytesseract
    from PIL import Image, ImageFilter, ImageEnhance
    import cv2

    img = cv2.imread('ss.png')
    img = cv2.resize(img, (0,0), fx=3, fy=3)
    cv2.imwrite("new.png", img)
    img1 = cv2.imread("new.png", 0)

    # Apply dilation and erosion
    kernel = np.ones((2, 2), np.uint8)
    img1 = cv2.dilate(img1, kernel, iterations=1)
    img1 =

Python, text detection OCR

落爺英雄遲暮 posted on 2019-12-23 17:01:11
Question: I am trying to extract data from a scanned form. The form has a standard format similar to the one shown in the image below: I have tried using pytesseract (Tesseract OCR) to detect the text in the image, and it has done a decent job of finding the text and converting the image to text. However, it essentially gives me all the detected text without preserving the layout of the data. I would like to be able to do something like the below: Find a particular piece of text and then find the associated
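The question above is cut off, so what exactly should be associated with the label is unknown. As one possible reading, the sketch below uses `pytesseract.image_to_data`, which returns a position and line number for each word, to find a label and collect the words to its right on the same OCR line. `ocr_words` and `words_right_of` are hypothetical helper names, and matching a label as a single word is a simplifying assumption.

```python
def ocr_words(image_path):
    """Return pytesseract's per-word data (text, position, line numbers) as a dict."""
    # Imported lazily so words_right_of can be tried on hand-made data.
    import pytesseract
    from PIL import Image

    with Image.open(image_path) as im:
        return pytesseract.image_to_data(im, output_type=pytesseract.Output.DICT)


def words_right_of(data, label):
    """Return the words to the right of the first `label` on the same OCR line."""
    label = label.lower()

    def line_key(i):
        return (data["block_num"][i], data["par_num"][i], data["line_num"][i])

    for i, word in enumerate(data["text"]):
        if word.strip().lower() != label:
            continue
        matches = [
            (data["left"][j], other)
            for j, other in enumerate(data["text"])
            if other.strip()
            and line_key(j) == line_key(i)
            and data["left"][j] > data["left"][i]
        ]
        # Sort by horizontal position so the words come back left to right.
        return [other for _, other in sorted(matches)]
    return []
```

For a form with a field such as `Name:`, `words_right_of(ocr_words("form.png"), "Name:")` would return the words that follow it on that line, where `form.png` is a placeholder file name. Values printed below the label, rather than beside it, would need a comparison on `top` instead.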

Using Zxing and Google Goggles with my app

不打扰是莪最后的温柔 posted on 2019-12-23 15:37:29
Question: I have an app, and I used this code to integrate ZXing:

    public Button.OnClickListener mScan = new Button.OnClickListener() {
        public void onClick(View v) {
            Intent intent = new Intent("com.google.zxing.client.android.SCAN");
            intent.putExtra("SCAN_MODE", "QR_CODE_MODE");
            startActivityForResult(intent, 0);
        }
    };

    public void onActivityResult(int requestCode, int resultCode, Intent intent) {
        if (requestCode == 0) {
            if (resultCode == RESULT_OK) {
                String contents = intent.getStringExtra("SCAN_RESULT");

Scanning Text (OCR) in Windows Phone 7.5 [closed]

喜你入骨 posted on 2019-12-23 15:25:59
Question: [Closed 6 years ago as not a good fit for the Q&A format.] Is there a way to force the ZXing library to scan text only? I am looking for an offline (non-cloud) solution to scan text in Windows

Android : How to capture text from camera without taking picture?

寵の児 posted on 2019-12-23 12:34:10
Question: I want to capture text and numbers shown in the camera preview, without taking a picture, using tess-two (on Android, in Eclipse). I don't want to save an image file. Something like this (it captures live from the camera): I have used tess-two, but I have to take a picture first and then capture the text (using this link: https://stackoverflow.com/questions/19533273/best-ocr-optical-character-recognition-example-in-android), and I have used this (https://www.codeproject.com/Articles/840623/Android-Character

Convert image to searchable pdf [closed]

自古美人都是妖i posted on 2019-12-23 10:23:53
Question: [Closed 7 years ago as not a good fit for the Q&A format.] Hi, I am looking for an open-source Java API that can convert a TIFF image to a searchable PDF (OCR). I have researched around but found