Pytesseract doesn't recognize a very clear image

Submitted by 余生颓废 on 2019-12-11 17:23:00

Question


I have applied pytesseract to three similar images of the digit "2". Only in the last one does pytesseract recognize the digit correctly. The three images have different dimensions, and if I resize the images in the right way, pytesseract recognizes them correctly. But I don't understand why a powerful OCR engine like tesseract fails on such a simple and clear image.

First image: recognition fails

Second image: recognition also fails

Third image: recognized successfully

I'm using Python 3.7 with Anaconda and tesseract v4.0.0.20181030 (leptonica-1.76.0; libgif 5.1.4; libjpeg 8d (libjpeg-turbo 1.5.3); libpng 1.6.34; libtiff 4.0.9; zlib 1.2.11; libwebp 0.6.1; libopenjp2 2.2.0).
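For readers who want to reproduce the resizing effect described above, here is a minimal sketch, not the code actually used in the question. The helper names `scaled_size` and `read_single_char`, and the target height of 64 pixels, are assumptions chosen for illustration. The `--psm 10` option is Tesseract's page segmentation mode for treating the image as a single character.

```python
def scaled_size(width, height, target_height=64):
    """Return (new_width, new_height) so the image height equals
    target_height while keeping the aspect ratio.
    target_height=64 is an illustrative assumption, not a tuned value."""
    scale = target_height / height
    return max(1, round(width * scale)), target_height


def read_single_char(path, target_height=64):
    """Resize the image at `path` and run pytesseract on it as one character."""
    # Imported here so scaled_size() can be used without the OCR dependencies.
    from PIL import Image
    import pytesseract

    img = Image.open(path).convert("L")  # grayscale
    img = img.resize(scaled_size(img.width, img.height, target_height),
                     Image.LANCZOS)
    # --psm 10: treat the image as a single character.
    return pytesseract.image_to_string(img, config="--psm 10").strip()
```

Calling `read_single_char` on each of the three images with a few different `target_height` values is one way to check which character heights the installed Tesseract model handles.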


Answer 1:


You can find extensive documentation on how to train tesseract-ocr here.

The only tricky part of training tesseract is the box files. I recommend using Tesseract-OCR Chopper to generate box files for training.
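To make the box files less opaque: each line of a Tesseract box file describes one symbol as `symbol left bottom right top page`, with pixel coordinates measured from the bottom-left corner of the image. The sketch below parses such lines for inspection. The names `Box`, `parse_box_line` and `parse_box_text` are hypothetical helpers, not part of Tesseract or pytesseract.

```python
from typing import List, NamedTuple


class Box(NamedTuple):
    symbol: str
    left: int
    bottom: int
    right: int
    top: int
    page: int


def parse_box_line(line: str) -> Box:
    """Parse one box-file line: 'symbol left bottom right top page'."""
    # Split from the right so the five numeric fields are always taken last.
    symbol, left, bottom, right, top, page = line.strip().rsplit(" ", 5)
    return Box(symbol, int(left), int(bottom), int(right), int(top), int(page))


def parse_box_text(text: str) -> List[Box]:
    """Parse the full contents of a box file, skipping blank lines."""
    return [parse_box_line(ln) for ln in text.splitlines() if ln.strip()]
```

For example, `parse_box_line("2 10 12 40 58 0")` yields a box for the symbol "2" that is `top - bottom = 46` pixels tall, which is a quick way to check the character size in a training image.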



Source: https://stackoverflow.com/questions/54394447/pytesseract-dont-reconize-a-very-clear-image
