Page layout analysis using Tesseract?

一个人的身影 2020-12-23 10:40

Tesseract 3 is able to perform page layout analysis. However, I couldn't find any sample code or documentation on how to use the library for such purposes. I hope someone h

4 Answers
  •  一整个雨季
    2020-12-23 11:16

    Not sure if this exactly answers your question, but I landed here looking for a way to get bounding-box coordinates (and, optionally, the text recognised inside each box) for an input image. This is now possible with Tesseract:

    $> tesseract test.tiff test.txt -l eng -psm 1 tsv
    

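    The parameters to note in the snippet above are 'psm' and 'tsv'. 'psm' selects the page segmentation mode (1 is automatic page segmentation with orientation and script detection; the flag is spelled '--psm' in Tesseract 4 and later). 'tsv' generates a tabular output file with all the information you need about your text image: page, block, paragraph, line and word numbers, bbox coordinates, confidence, and predicted text (shown below).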

    level  page_num  block_num  par_num  line_num  word_num  left  top  width  height  conf  text
    1      1         0          0        0         0         0     0    5500   4250    -1
    2      1         1          0        0         0         327   285  2218   53      -1
    3      1         1          1        0         0         327   285  2218   53      -1
    4      1         1          1        1         0         327   285  2218   53      -1
    5      1         1          1        1         1         327   285  246    38      87    INFOPAC
    5      1         1          1        1         2         620   287  165    38      87    PAGE
    5      1         1          1        1         3         952   290  100    37      95    NAME
    5      1         1          1        1         4         1173  292  1082   45      39    ENTRYDATE
    5      1         1          1        1         5         2333  302  212    36      48    EMAIL
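
To use this output from a program, the TSV can be read with an ordinary tab-separated parser. Below is a minimal Python sketch, assuming the file contents have already been loaded into a string. The names `parse_words`, `min_conf` and `SAMPLE_TSV` are hypothetical and only serve the illustration. The sample rows are copied from the output above. It keeps only the word-level rows (level 5) and turns left/top/width/height into box corners.

```python
import csv
import io

HEADER = ["level", "page_num", "block_num", "par_num", "line_num",
          "word_num", "left", "top", "width", "height", "conf", "text"]

# Rows copied from the TSV output shown above (one line-level row, five word rows).
ROWS = [
    ["4", "1", "1", "1", "1", "0", "327", "285", "2218", "53", "-1", ""],
    ["5", "1", "1", "1", "1", "1", "327", "285", "246", "38", "87", "INFOPAC"],
    ["5", "1", "1", "1", "1", "2", "620", "287", "165", "38", "87", "PAGE"],
    ["5", "1", "1", "1", "1", "3", "952", "290", "100", "37", "95", "NAME"],
    ["5", "1", "1", "1", "1", "4", "1173", "292", "1082", "45", "39", "ENTRYDATE"],
    ["5", "1", "1", "1", "1", "5", "2333", "302", "212", "36", "48", "EMAIL"],
]
SAMPLE_TSV = "\n".join("\t".join(row) for row in [HEADER] + ROWS)


def parse_words(tsv_text, min_conf=0.0):
    """Return one dict per word-level row (level 5) whose confidence is >= min_conf."""
    # QUOTE_NONE: recognised text may contain quote characters that are not CSV quoting.
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t",
                            quoting=csv.QUOTE_NONE)
    words = []
    for row in reader:
        if row["level"] != "5":
            continue  # levels 1-4 describe page, block, paragraph and line boxes
        conf = float(row["conf"])
        if conf < min_conf:
            continue
        left, top = int(row["left"]), int(row["top"])
        words.append({
            "text": row.get("text") or "",
            "left": left,
            "top": top,
            "right": left + int(row["width"]),
            "bottom": top + int(row["height"]),
            "conf": conf,
        })
    return words


if __name__ == "__main__":
    for word in parse_words(SAMPLE_TSV, min_conf=80):
        print(word["text"], (word["left"], word["top"], word["right"], word["bottom"]))
```

Filtering on the `level` column is what separates layout information from recognised words: rows with levels 1 to 4 carry the page, block, paragraph and line boxes, so the same loop could collect those instead if only the layout is of interest.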
    
