Identify clear text from an image in Python

不想你离开。 submitted on 2019-11-30 09:59:53

Question


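I used pytesseract to identify the text in an image. First I set the path to the Tesseract executable:

import pytesseract
from PIL import Image

pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'

Then I used the code below to extract the text:

# imgLoc is the image folder and imgName the file name (both defined elsewhere)
textImg = pytesseract.image_to_string(Image.open(imgLoc + "/" + imgName))
print(textImg)

# Save the recognized text; UTF-8 avoids encoding errors on non-ASCII output
with open(imgLoc + "/" + "oriText.txt", "w", encoding="utf-8") as text_file:
    text_file.write(textImg)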

This is my input image.

This is an image of my output text file.

Is there any way to identify the text in the image more accurately?


Answer 1:


You can try improving the results by restricting the character set to only the characters that are valid in your particular language (excluding digits, special characters, etc.). This answer will help.
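As a minimal sketch of the character-set restriction described above: Tesseract has a `tessedit_char_whitelist` variable that pytesseract can pass through its `config` argument. The helper names `build_whitelist_config` and `ocr_letters_only` are hypothetical, and the choice of ASCII letters as the allowed set is only an assumption about the language of the image. Whether the whitelist is honored can depend on the installed Tesseract version and OCR engine, so treat this as something to try rather than a guaranteed fix.

```python
import string


def build_whitelist_config(allowed_chars: str) -> str:
    """Build a Tesseract config string that limits output to allowed_chars.

    Hypothetical helper: whitespace and quotes are rejected because the
    config string is split into command-line arguments before being used.
    """
    if not allowed_chars:
        raise ValueError("allowed_chars must not be empty")
    if any(ch.isspace() or ch in "'\"\\" for ch in allowed_chars):
        raise ValueError("allowed_chars must not contain whitespace, quotes or backslashes")
    return f"-c tessedit_char_whitelist={allowed_chars}"


def ocr_letters_only(image_path: str) -> str:
    """Run OCR while allowing only ASCII letters (an assumed character set)."""
    # Imported here so the config helper can be used without OCR installed.
    import pytesseract
    from PIL import Image

    config = build_whitelist_config(string.ascii_letters)
    return pytesseract.image_to_string(Image.open(image_path), config=config)
```

Calling `ocr_letters_only(imgLoc + "/" + imgName)` would then replace the plain `image_to_string` call from the question. If digits are legitimate in your text, add `string.digits` to the allowed set.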

Tesseract OCR isn't the best at recognizing characters in an image. You can try preprocessing the image a bit to improve the results. This will help:

  • Make sure the image DPI/PPI is above 250; otherwise the results may be inaccurate.
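The preprocessing and DPI advice above could be sketched roughly as follows, using Pillow. The names `upscale_factor` and `preprocess_for_ocr` are hypothetical, and the target of 300 DPI, the fallback of 72 DPI for images without DPI metadata, and the threshold of 150 are assumptions to tune for your own images, not values from the answer.

```python
def upscale_factor(current_dpi: float, target_dpi: float = 300.0) -> float:
    """Return the resize factor needed to reach target_dpi (never below 1.0)."""
    if current_dpi <= 0:
        raise ValueError("current_dpi must be positive")
    return max(1.0, target_dpi / current_dpi)


def preprocess_for_ocr(image_path, fallback_dpi=72.0, threshold=150):
    """Grayscale, upscale and binarize an image before passing it to OCR."""
    # Pillow is imported here so upscale_factor works without it installed.
    from PIL import Image, ImageOps

    img = Image.open(image_path)

    # Many images carry no DPI metadata; fall back to an assumed value.
    dpi = float(img.info.get("dpi", (fallback_dpi, fallback_dpi))[0]) or fallback_dpi
    factor = upscale_factor(dpi)

    gray = ImageOps.grayscale(img)
    if factor > 1.0:
        new_size = (round(gray.width * factor), round(gray.height * factor))
        gray = gray.resize(new_size, Image.LANCZOS)

    # Stretch the contrast, then turn every pixel into pure black or white.
    gray = ImageOps.autocontrast(gray)
    return gray.point(lambda p: 255 if p > threshold else 0)
```

The returned image can be passed straight to `pytesseract.image_to_string` in place of `Image.open(...)`. Upscaling does not add real detail, so a scan or photo taken at a higher resolution will usually help more than resizing a small image.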

I generally prefer the website www.onlineocr.net for optical character recognition, as in my experience the results are almost perfect each time. You can try using its API for character recognition (it requires an internet connection to work). The results obtained with this API are far superior to those from Tesseract OCR, so you may want to give it a try.



Source: https://stackoverflow.com/questions/56303292/identify-clear-text-from-image-python
