How to get character-wise confidence in Tesseract using the command line?

深忆病人 2020-12-19 11:48

I can get word-level confidence scores from Tesseract 4.0 on the command line. Is there a way to get character-level confidence too?

1 Answer
  • 2020-12-19 12:15

    Set hocr_char_boxes to 1 in your config file. Or, at the command line, your updated command would be:

    tesseract [Image name] outputbase --oem 1 -l eng --psm 8 -c hocr_char_boxes=1 hocr
    

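    Note the hocr output option at the end of the command; it writes outputbase.hocr. In that file, word-level confidence is the x_wconf value on each ocrx_word span, and with hocr_char_boxes=1 the per-character confidence should appear as x_conf on the ocrx_cinfo spans nested inside each word. A word span looks like this, e.g.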

     <span class='ocrx_word' id='word_1_1' title='bbox 127 344 4618 6915; x_wconf 1'>
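
If you want the character confidences as data rather than reading the hOCR by eye, here is a minimal Python sketch using only the standard library. It assumes the characters are emitted as ocrx_cinfo spans whose title contains an x_conf value, which is what I would expect from hocr_char_boxes=1 but you should check against your own output. The names CharConfParser, char_confidences and SAMPLE are made up for this example, and SAMPLE is a hand-written fragment, not real Tesseract output.

```python
import re
from html.parser import HTMLParser


class CharConfParser(HTMLParser):
    """Collect (character, confidence) pairs from ocrx_cinfo spans in hOCR."""

    def __init__(self):
        super().__init__()
        self.results = []
        # Confidence of the ocrx_cinfo span currently being read, or None.
        self._conf = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "span" and attrs.get("class") == "ocrx_cinfo":
            # Assumed title layout: "x_bboxes l t r b; x_conf 98.5"
            match = re.search(r"x_conf\s+([\d.]+)", attrs.get("title") or "")
            self._conf = float(match.group(1)) if match else None

    def handle_data(self, data):
        # Only text directly inside an ocrx_cinfo span is recorded.
        if self._conf is not None and data.strip():
            self.results.append((data.strip(), self._conf))

    def handle_endtag(self, tag):
        if tag == "span":
            self._conf = None


def char_confidences(hocr_text):
    """Return a list of (character, confidence) tuples found in hocr_text."""
    parser = CharConfParser()
    parser.feed(hocr_text)
    parser.close()
    return parser.results


# Hand-written illustrative fragment (values are invented, not OCR results).
SAMPLE = (
    "<span class='ocrx_word' id='word_1_1' title='bbox 10 10 50 30; x_wconf 93'>"
    "<span class='ocrx_cinfo' title='x_bboxes 10 10 30 30; x_conf 98.5'>H</span>"
    "<span class='ocrx_cinfo' title='x_bboxes 30 10 50 30; x_conf 91.25'>i</span>"
    "</span>"
)

if __name__ == "__main__":
    for char, conf in char_confidences(SAMPLE):
        print(char, conf)
```

To run it on real output, read the file first, e.g. char_confidences(open("outputbase.hocr", encoding="utf-8").read()). The word-level x_wconf values are ignored on purpose, since only ocrx_cinfo spans are matched.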
    

    Let me know if this works for you, otherwise I'll just delete the answer.

    Source: https://github.com/tesseract-ocr/tesseract/issues/1465#issuecomment-513139976
