How do I segment a document using Tesseract, then output the resulting bounding boxes and labels?

忘了有多久 2020-12-07 10:25

I'm trying to get Tesseract to output a file with labelled bounding boxes that result from page segmentation (pre OCR). I know it must be capable of doing this 'out of the

6 Answers
  •  北海茫月
    2020-12-07 11:02

    If you are familiar with Python, you can use the tesserocr library directly; it is a nice Python wrapper around the Tesseract C++ API. Here is a code snippet that draws polygons at block level using PIL:

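    from PIL import Image, ImageDraw
    from tesserocr import PyTessBaseAPI, RIL, iterate_level, PSM
    
    filename = 'page.png'  # placeholder path (assumption): point this at your own image
    # Convert to RGB so the coloured outline below can be drawn on any input mode
    img = Image.open(filename).convert('RGB')
    
    results = []
    with PyTessBaseAPI() as api:
        api.SetImage(img)
        # AUTO_ONLY runs automatic page segmentation without OSD or OCR
        api.SetPageSegMode(PSM.AUTO_ONLY)
        iterator = api.AnalyseLayout()
        for w in iterate_level(iterator, RIL.BLOCK):
            if w is not None:
                # keep the block type (an integer code) and its outline polygon
                results.append((w.BlockType(), w.BlockPolygon()))
    print('Found {} block elements.'.format(len(results)))
    
    draw = ImageDraw.Draw(img)
    for block_type, poly in results:
        if not poly:
            continue  # skip blocks that come back without a polygon
        # you can define a color per block type (see tesserocr.PT for block types list)
        draw.line(poly + [poly[0]], fill=(0, 255, 0), width=2)
    img.save('blocks.png')  # hypothetical output name, so the drawn outlines are kept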
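
    The snippet above only draws the outlines. To get a file of labelled bounding boxes, as the question asks, one option is to turn each `(block_type, polygon)` pair in `results` into a label plus an axis-aligned box and dump the list as JSON. The following is a minimal sketch, not part of tesserocr: the helper names (`polygon_to_bbox`, `label_blocks`, `write_blocks`) and the JSON layout are made up for illustration, and the label table assumes the integer codes follow the order of Tesseract's `PolyBlockType` enum (exposed as `tesserocr.PT`), which is worth checking against your installed version.

```python
import json

# Assumed to mirror the order of Tesseract's PolyBlockType enum (tesserocr.PT);
# verify against your installed version before relying on the names.
BLOCK_TYPE_NAMES = [
    "UNKNOWN", "FLOWING_TEXT", "HEADING_TEXT", "PULLOUT_TEXT", "EQUATION",
    "INLINE_EQUATION", "TABLE", "VERTICAL_TEXT", "CAPTION_TEXT",
    "FLOWING_IMAGE", "HEADING_IMAGE", "PULLOUT_IMAGE", "HORZ_LINE",
    "VERT_LINE", "NOISE",
]


def polygon_to_bbox(poly):
    """Return (left, top, right, bottom) enclosing a list of (x, y) points."""
    xs = [x for x, _ in poly]
    ys = [y for _, y in poly]
    return min(xs), min(ys), max(xs), max(ys)


def label_blocks(results):
    """Convert (block_type, polygon) pairs into dicts with a label and a bbox."""
    labelled = []
    for block_type, poly in results:
        if not poly:
            continue  # no polygon available for this block
        if 0 <= block_type < len(BLOCK_TYPE_NAMES):
            name = BLOCK_TYPE_NAMES[block_type]
        else:
            name = "UNKNOWN"
        labelled.append({"label": name, "bbox": list(polygon_to_bbox(poly))})
    return labelled


def write_blocks(results, path):
    """Write the labelled boxes to a JSON file (hypothetical output format)."""
    with open(path, "w") as f:
        json.dump(label_blocks(results), f, indent=2)
```

    With the `results` list from the snippet above, `write_blocks(results, 'blocks.json')` would write one entry per block. The box is the bounding rectangle of the polygon, so it can be looser than the polygon itself for non-rectangular blocks.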
    
