How do I segment a document using Tesseract then output the resulting bounding boxes and labels

后端未结

关注

 6  2046

忘了有多久 2020-12-07 10:25

I\'m trying to get Tesseract to output a file with labelled bounding boxes that result from page segmentation (pre OCR). I know it must be capable of doing this \'out of the

6条回答

北海茫月 (楼主)

2020-12-07 11:02

If you are python familiar, you can directly use tesserocr library which is a nice python wrapper around the C++ API. Here is a code snippet to draw polygons at block level using PIL:

from PIL import Image, ImageDraw
from tesserocr import PyTessBaseAPI, RIL, iterate_level, PSM

img = Image.open(filename)

results = []
with PyTessBaseAPI() as api:
    api.SetImage(img)
    api.SetPageSegMode(PSM.AUTO_ONLY)
    iterator = api.AnalyseLayout()
    for w in iterate_level(iterator, RIL.BLOCK):
        if w is not None:
            results.append((w.BlockType(), w.BlockPolygon()))
print('Found {} block elements.'.format(len(results)))

draw = ImageDraw.Draw(img)
for block_type, poly in results:
    # you can define a color per block type (see tesserocr.PT for block types list)
    draw.line(poly + [poly[0]], fill=(0, 255, 0), width=2)

0 讨论(0)

查看其它6个回答