OCR - how to get text from outlined words

Submitted by 一曲冷凌霜 on 2019-12-22 08:28:07

Question


I have an image of text in which the words are outlined rather than filled in. Tesseract struggles to get any of the words correct. Does anyone have a solution for this type of problem?

I have tried simple operations like inversion, but to no effect. I'm guessing Tesseract already handles this.
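For reference, the inversion attempt can be sketched roughly as below. This is a minimal illustration, not the exact code that was run: the helper names `invert_for_ocr` and `ocr_inverted` are hypothetical, and the conversion to greyscale is an assumption made because `ImageOps.invert` does not accept every image mode.

```python
from PIL import Image, ImageOps


def invert_for_ocr(image):
    """Return an inverted 8-bit greyscale copy of the image (hypothetical helper)."""
    # Convert first: ImageOps.invert rejects some modes, such as RGBA.
    return ImageOps.invert(image.convert("L"))


def ocr_inverted(path):
    """Run Tesseract on the inverted image at `path` (hypothetical helper)."""
    # Imported lazily so invert_for_ocr can be used without Tesseract installed.
    import pytesseract

    with Image.open(path) as img:
        return pytesseract.image_to_string(invert_for_ocr(img))
```

Inversion only swaps foreground and background, so outlined letters remain hollow afterwards, which may be why it makes no difference here.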

Image example (image not reproduced here):
Typical output for "Next": New
Typical output for "Previous": Pflevuows

My (very simple) code, which takes the image path as an argument:

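import sys

import pytesseract
from PIL import Image

# Run Tesseract on the image passed as the first command-line argument
print(pytesseract.image_to_string(Image.open(sys.argv[1])))
# Echo the file name so the output can be matched to its image
print(sys.argv[1])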

EDIT: Applying a binary threshold gets me "Next" correctly, but still does not get "Previous".
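A binary threshold of this kind might look like the following sketch using Pillow alone. The cutoff of 128 and the helper name `binarize` are assumptions for illustration; the value actually used is not recorded here.

```python
from PIL import Image


def binarize(image, threshold=128):
    """Map each pixel to pure black or pure white (hypothetical helper).

    The default threshold of 128 is an assumed midpoint, not a tuned value.
    """
    grey = image.convert("L")
    # Pixels at or above the threshold become white (255), the rest black (0).
    return grey.point(lambda p: 255 if p >= threshold else 0)
```

The result can be passed to `pytesseract.image_to_string` in place of the original image, for example `pytesseract.image_to_string(binarize(Image.open(sys.argv[1])))`.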

Source: https://stackoverflow.com/questions/38000761/ocr-how-to-get-text-from-outlined-words
