python-tesseract

OSError: [Errno 2] No such file or directory using pytesser

独自空忆成欢 submitted on 2019-11-27 16:47:14
Question: I want to use pytesser to extract the text from a picture. My operating system is Mac OS 10.11, and I have already installed PIL, pytesser, the tesseract-ocr engine, and supporting libraries such as libpng. But when I run the code below, an error occurs.
    from pytesser import *
    import os
    image = Image.open('/Users/Grant/Desktop/1.png')
    text = image_to_string(image)
    print text
The error message is:
    Traceback (most recent call last): File "/Users/Grant/Documents

pytesseract cannot find the file specified

二次信任 submitted on 2019-11-27 13:36:55
My code is straightforward:
    import pytesseract
    from PIL import Image
    img = Image.open('C:/temp/foo.jpg')
    img.load()
    i = pytesseract.image_to_string(img)
and the error response I get back is:
    Traceback (most recent call last):
    File "img.py", line 6, in <module> i = pytesseract.image_to_string(img)
    File "build\bdist.win32\egg\pytesseract\pytesseract.py", line 161, in image_to_string
    File "build\bdist.win32\egg\pytesseract\pytesseract.py", line 94, in run_tesseract
    File "C:\Users\%USER%\AppData\Local\Continuum\Anaconda\lib\subprocess.py", line 710, in __init__ errread,
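The traceback above ends inside subprocess.py, which suggests that the failure comes from launching an external program rather than from opening the image. The following minimal sketch shows how Python reports an executable that cannot be found. It assumes Python 3, and the command name `no-such-ocr-binary-xyz` is made up purely to trigger the error; it is not part of pytesseract.

```python
import errno
import subprocess


def try_run(command):
    """Run a command and report whether the executable itself could be found."""
    try:
        completed = subprocess.run(
            command, stdout=subprocess.PIPE, stderr=subprocess.PIPE
        )
    except FileNotFoundError as exc:
        # Raised when the program named in command[0] cannot be located.
        # It is a subclass of OSError carrying errno 2 (ENOENT).
        return {"found": False, "errno": exc.errno}
    return {"found": True, "returncode": completed.returncode}


# Hypothetical command name, used only to provoke the "not found" case.
missing = try_run(["no-such-ocr-binary-xyz", "--version"])
print(missing["found"], missing["errno"] == errno.ENOENT)
```

If the same check fails for `["tesseract", "--version"]`, the OCR engine is probably not reachable through PATH from the Python process, independent of whether the image path is valid.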

Getting the bounding box of the recognized words using python-tesseract

拥有回忆 submitted on 2019-11-27 11:08:34
I am using python-tesseract to extract words from an image. This is a Python wrapper for tesseract, which is an OCR engine. I am using the following code to get the words:
    import tesseract
    api = tesseract.TessBaseAPI()
    api.Init(".","eng",tesseract.OEM_DEFAULT)
    api.SetVariable("tessedit_char_whitelist", "0123456789abcdefghijklmnopqrstuvwxyz")
    api.SetPageSegMode(tesseract.PSM_AUTO)
    mImgFile = "test.jpg"
    mBuffer=open(mImgFile,"rb").read()
    result = tesseract.ProcessPagesBuffer(mBuffer,len(mBuffer),api)
    print "result(ProcessPagesBuffer)=",result
This returns only the words and not their location
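One possible route to word locations, sketched here under assumptions, swaps in a different wrapper from the one used in the question: pytesseract's `image_to_data` can return a dictionary of parallel lists (`text`, `conf`, `left`, `top`, `width`, `height`, among others). The helper `word_boxes` and the sample values below are hypothetical and hand-written for illustration; they are not real OCR output.

```python
def word_boxes(data, min_conf=0.0):
    """Collect (text, left, top, width, height) for each recognized word."""
    boxes = []
    for i, text in enumerate(data["text"]):
        word = text.strip()
        if not word:
            continue  # rows describing pages, blocks, or lines carry no text
        if float(data["conf"][i]) < min_conf:
            continue  # drop words below the requested confidence
        boxes.append(
            (word, data["left"][i], data["top"][i],
             data["width"][i], data["height"][i])
        )
    return boxes


# Made-up sample in the shape of a dictionary result (not real OCR output).
sample = {
    "text": ["", "Hello", "world", " "],
    "conf": ["-1", "96", "91", "-1"],
    "left": [0, 10, 70, 0],
    "top": [0, 20, 20, 0],
    "width": [200, 50, 55, 0],
    "height": [60, 18, 18, 0],
}
boxes = word_boxes(sample, min_conf=50)
print(boxes)

# With pytesseract and Pillow installed, the dictionary would come from:
#   data = pytesseract.image_to_data(Image.open("test.jpg"),
#                                    output_type=pytesseract.Output.DICT)
```

Coordinates are in pixels relative to the top-left corner of the image, so each tuple can be passed directly to a drawing or cropping routine.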

Tesseract Not Found Error

时光总嘲笑我的痴心妄想 submitted on 2019-11-27 03:38:46
I am trying to use pytesseract in Python but always end up with the error: "TesseractNotFoundError: tesseract is not installed or it's not in your path". pytesseract and tesseract are both installed on the system. I am new to Python, so I would really appreciate it if somebody could help me with this.
Ben Hooper: I tried adding it to the PATH variable as others have mentioned, but still received the same error. What worked was adding this to my script:
    pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe"
I got this error because I installed pytesseract with pip but forget
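The fix quoted above hard-codes one install location. A slightly more defensive variant, offered only as a sketch, looks for the executable on PATH first and falls back to a list of candidate locations. The helper name `find_tesseract` is hypothetical, and the Windows path is the one from the answer; it may differ on other machines.

```python
import os
import shutil


def find_tesseract(candidates=()):
    """Return a path to the tesseract executable, or None if none is found."""
    on_path = shutil.which("tesseract")
    if on_path:
        return on_path  # already reachable through PATH
    for path in candidates:
        if os.path.isfile(path):
            return path  # first fallback location that exists
    return None


# Install location quoted in the answer; an assumption on any other machine.
WINDOWS_DEFAULT = r"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe"
tesseract_path = find_tesseract([WINDOWS_DEFAULT])
print(tesseract_path)

# With pytesseract installed, the result would be applied like this:
#   if tesseract_path:
#       pytesseract.pytesseract.tesseract_cmd = tesseract_path
```

If the function returns None, the OCR engine itself is most likely not installed; installing the pytesseract package with pip does not install the tesseract binary.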

Detect text region in image using Opencv

谁说我不能喝 submitted on 2019-11-26 18:54:36
Question: I have an image and want to detect the text regions in it. I tried the TiRG_RAW_20110219 project, but the results are not satisfactory. If the input image is http://imgur.com/yCxOvQS,GD38rCa it produces http://imgur.com/yCxOvQS,GD38rCa#1 as output. Can anyone suggest an alternative? I want to use this to improve the output of tesseract by sending it only the text regions as input.
Answer 1:
    import cv2
    def captch_ex(file_name):
        img = cv2.imread(file_name)
        img_final = cv2.imread(file_name)
        img2gray = cv2

pytesseract cannot find the file specified

泪湿孤枕 submitted on 2019-11-26 18:16:19
Question: My code is straightforward:
    import pytesseract
    from PIL import Image
    img = Image.open('C:/temp/foo.jpg')
    img.load()
    i = pytesseract.image_to_string(img)
and the error response I get back is:
    Traceback (most recent call last):
    File "img.py", line 6, in <module> i = pytesseract.image_to_string(img)
    File "build\bdist.win32\egg\pytesseract\pytesseract.py", line 161, in image_to_string
    File "build\bdist.win32\egg\pytesseract\pytesseract.py", line 94, in run_tesseract
    File

Getting the bounding box of the recognized words using python-tesseract

只谈情不闲聊 submitted on 2019-11-26 15:21:01
Question: I am using python-tesseract to extract words from an image. This is a Python wrapper for tesseract, which is an OCR engine. I am using the following code to get the words:
    import tesseract
    api = tesseract.TessBaseAPI()
    api.Init(".","eng",tesseract.OEM_DEFAULT)
    api.SetVariable("tessedit_char_whitelist", "0123456789abcdefghijklmnopqrstuvwxyz")
    api.SetPageSegMode(tesseract.PSM_AUTO)
    mImgFile = "test.jpg"
    mBuffer=open(mImgFile,"rb").read()
    result = tesseract.ProcessPagesBuffer(mBuffer,len