OCR - how to get text from outlined words

Question

I have an image of text, where the words are outlined rather than filled in. Tesseract is struggling to get any of the words correct - does anyone have a solution to these types of problems?

I have tried simple operations like inversion, but to no effect. I'm guessing Tesseract already handles this.

Image example:
Typical output for "Next": New
Typical output for "Previous": Pﬂevuows

My (very simple) code, which takes the image path as an argument:

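import sys

import pytesseract
from PIL import Image

# Run OCR on the image whose path is given as the first argument
print(pytesseract.image_to_string(Image.open(sys.argv[1])))
# Echo the path so the output can be matched to its input file
print(sys.argv[1])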

EDIT: Applying a binary threshold gets me "Next", but it still does not seem to get "Previous".
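For reference, the thresholding step mentioned in the edit might look roughly like the sketch below. It is only an illustration of that one preprocessing step, not a fix for the outlined-text problem: the helper name `binarize` and the cutoff of 128 are assumptions, not values taken from the original attempt, and the cutoff would likely need tuning per image.

```python
import sys

from PIL import Image


def binarize(img, threshold=128):
    """Return a black-and-white copy of img using a fixed cutoff.

    `threshold` is an assumed value; adjust it for the image at hand.
    """
    gray = img.convert("L")  # collapse to 8-bit grayscale
    # Pixels brighter than the cutoff become white (255), the rest black (0)
    return gray.point(lambda p: 255 if p > threshold else 0)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Imported here so the helper above can be used without Tesseract installed
    import pytesseract

    bw = binarize(Image.open(sys.argv[1]))
    print(pytesseract.image_to_string(bw))
```

Run it the same way as the script above, passing the image path as the first argument. Keeping the threshold in its own function makes it easy to try different cutoffs before handing the image to `pytesseract.image_to_string`.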

Source: https://stackoverflow.com/questions/38000761/ocr-how-to-get-text-from-outlined-words

Tags

python

OpenCV

python-imaging-library

ocr

tesseract
