pytesseract doesn't work with a single-digit image

Submitted by Deadly on 2020-01-01 09:38:06

Question


I have code that uses pytesseract and works perfectly, except when the image I'm trying to recognize contains a single digit (0 to 9). If the image has only one digit, it returns no result.

Here is a sample of the images I'm working with: https://drive.google.com/folderview?id=0B68PDhV5SW8BdFdWYVRwODBVZk0&usp=sharing

And this is the code I'm using:

    import pytesseract
    from PIL import Image  # needed for Image.open

    varnum = pytesseract.image_to_string(Image.open('images/table/img.jpg'))
    varnum = float(varnum)  # raises ValueError when the OCR result is empty
    print(varnum)

Thanks!!!!

With this code I'm able to read all the numbers:

    import time

    import pytesseract
    from PIL import Image

    start_time = time.time()

    # Read each image twice with the page segmentation mode set explicitly.
    y = pytesseract.image_to_string(Image.open('images/table/1.jpg'), config='-psm 10000')
    x = pytesseract.image_to_string(Image.open('images/table/1.jpg'), config='-psm 10000')

    print(y)
    print(x)

    y = pytesseract.image_to_string(Image.open('images/table/68.5.jpg'), config='-psm 10000')
    x = pytesseract.image_to_string(Image.open('images/table/68.5.jpg'), config='-psm 10000')

    print(y)
    print(x)

    # Elapsed wall-clock time for the four OCR calls.
    print("%s seconds" % (time.time() - start_time))

Result:

    >>> 
    1
    1
    68.5
    68.5
    0.485644155358 seconds
    >>> 

Answer 1:


You need to set the page segmentation mode so that Tesseract can read a single character or digit.

From the tesseract-ocr manual (tesseract-ocr is what pytesseract uses internally), you can set the page segmentation mode with:

-psm N

Set Tesseract to only run a subset of layout analysis and assume a certain form of image. The options for N are:

10 = Treat the image as a single character.

So you should set the -psm option to 10. For example:

    varnum = pytesseract.image_to_string(Image.open('images/table/img.jpg'), config='-psm 10')
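
As a rough sketch of how this could be wrapped up for reuse, the helpers below build the config string and convert the OCR text to a number without crashing on an empty result. The function names (`psm_config`, `parse_number`, `read_number`) and the `legacy_flag` parameter are made up for illustration and are not part of pytesseract. One assumption worth checking against your installation: newer Tesseract releases spell the option `--psm` rather than `-psm`, so the sketch lets you pick either form.

```python
def psm_config(psm, legacy_flag=False):
    """Build the config string that sets the page segmentation mode.

    legacy_flag=True gives the older '-psm N' spelling used in this thread;
    the default gives '--psm N', which newer Tesseract releases expect.
    """
    flag = "-psm" if legacy_flag else "--psm"
    return "{} {}".format(flag, psm)


def parse_number(text):
    """Convert OCR output to a float, or return None if it is not a number."""
    cleaned = text.strip()  # drop surrounding whitespace and newlines
    try:
        return float(cleaned)
    except ValueError:
        # Empty or non-numeric OCR output ends up here instead of crashing.
        return None


def read_number(path, psm=10, legacy_flag=False):
    """Hypothetical helper: OCR one image and return its value as a float."""
    # Imported here so the two helpers above can be used on their own.
    from PIL import Image
    import pytesseract

    text = pytesseract.image_to_string(
        Image.open(path), config=psm_config(psm, legacy_flag)
    )
    return parse_number(text)
```

With this in place, something like `read_number('images/table/img.jpg')` would return a float for a readable digit and `None` when Tesseract finds nothing, instead of raising `ValueError` inside `float()`. For images holding several characters, such as `68.5`, a different mode (for example 7, which treats the image as a single text line) may be a better fit than 10; that is worth testing on your own images.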


Source: https://stackoverflow.com/questions/31643216/pytesseract-dont-work-with-one-digit-image
