How to get character-wise confidence in Tesseract using the command line?

烈酒焚心 submitted on 2019-11-29 11:51:21

Set hocr_char_boxes to 1 in your config file. Alternatively, pass it on the command line with -c; your updated command would be:

tesseract [Image name] outputbase --oem 1 -l eng --psm 8 -c hocr_char_boxes=1 hocr

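Note the hocr output option at the end of the command: it writes the result to outputbase.hocr. With hocr_char_boxes enabled, recent Tesseract versions should also nest per-character ocrx_cinfo spans (each with an x_conf value) inside every word span. The word-level confidence is x_wconf, e.g.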

 <span class='ocrx_word' id='word_1_1' title='bbox 127 344 4618 6915; x_wconf 1'>
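
If you want the per-character values without reading the file by hand, here is a minimal Python sketch that pulls them out of the hOCR text. It assumes your Tesseract version writes each character as a span with class ocrx_cinfo whose title contains "x_conf <number>"; check your own output file first, since older versions may not emit these spans. The names CharConfidenceParser and char_confidences are made up for this example and are not part of Tesseract.

```python
import re
from html.parser import HTMLParser


class CharConfidenceParser(HTMLParser):
    """Collect (character, confidence) pairs from ocrx_cinfo spans.

    Assumption: each character span looks like
    <span class='ocrx_cinfo' title='x_bboxes ...; x_conf 98.5'>A</span>
    """

    def __init__(self):
        super().__init__()
        self.chars = []        # list of (text, confidence) tuples
        self._in_char = False  # True while inside an ocrx_cinfo span
        self._conf = None      # confidence parsed from the current span

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "span" and attrs.get("class") == "ocrx_cinfo":
            # "x_conf" does not match the word-level "x_wconf" key.
            match = re.search(r"x_conf\s+([\d.]+)", attrs.get("title") or "")
            self._conf = float(match.group(1)) if match else None
            self._in_char = True

    def handle_data(self, data):
        if self._in_char:
            self.chars.append((data, self._conf))

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_char = False


def char_confidences(hocr_text):
    """Return a list of (character, confidence) pairs found in hocr_text."""
    parser = CharConfidenceParser()
    parser.feed(hocr_text)
    parser.close()
    return parser.chars
```

To use it on the file produced by the command above, read the file and pass its contents in, e.g. char_confidences(open("outputbase.hocr", encoding="utf-8").read()). A confidence of None means the span had no x_conf entry in its title.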

Let me know if this works for you, otherwise I'll just delete the answer.

Source: https://github.com/tesseract-ocr/tesseract/issues/1465#issuecomment-513139976
