I\'m trying to get Tesseract to output a file with labelled bounding boxes that result from page segmentation (pre OCR). I know it must be capable of doing this \'out of the
If you are python familiar, you can directly use tesserocr
library which is a nice python wrapper around the C++ API. Here is a code snippet to draw polygons at block level using PIL:
from PIL import Image, ImageDraw
from tesserocr import PyTessBaseAPI, RIL, iterate_level, PSM
img = Image.open(filename)
results = []
with PyTessBaseAPI() as api:
api.SetImage(img)
api.SetPageSegMode(PSM.AUTO_ONLY)
iterator = api.AnalyseLayout()
for w in iterate_level(iterator, RIL.BLOCK):
if w is not None:
results.append((w.BlockType(), w.BlockPolygon()))
print('Found {} block elements.'.format(len(results)))
draw = ImageDraw.Draw(img)
for block_type, poly in results:
# you can define a color per block type (see tesserocr.PT for block types list)
draw.line(poly + [poly[0]], fill=(0, 255, 0), width=2)