How do I segment a document using Tesseract, then output the resulting bounding boxes and labels?

忘了有多久 2020-12-07 10:25

I'm trying to get Tesseract to output a file with labelled bounding boxes that result from page segmentation (pre-OCR). I know it must be capable of doing this 'out of the

6 Answers

温柔的废话 (original poster)

2020-12-07 11:07

hOCR output with per-character bounding boxes has been available in Tesseract since version 4.1. Once you have checked your installation, run:

tesseract {image file} {output name} -c tessedit_create_hocr=1 -c hocr_char_boxes=1
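
To turn the hOCR file into a plain list of labels and boxes, something like the following Python sketch may help. It is not part of Tesseract; it assumes the output is written to `{output name}.hocr` and that each element carries a `class` such as `ocr_carea`, `ocr_par`, `ocr_line` or `ocrx_word` plus a `title` containing `bbox x0 y0 x1 y1`, which is what hOCR output usually looks like. The names `HocrBoxParser` and `extract_boxes` are made up for this example.

```python
import re
import sys
from html.parser import HTMLParser

# Matches the "bbox x0 y0 x1 y1" property inside an hOCR title attribute.
BBOX_RE = re.compile(r"bbox (\d+) (\d+) (\d+) (\d+)")


class HocrBoxParser(HTMLParser):
    """Collect (label, (x0, y0, x1, y1)) for elements whose class starts with 'ocr'."""

    def __init__(self):
        super().__init__()
        self.boxes = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        label = attrs.get("class") or ""
        match = BBOX_RE.search(attrs.get("title") or "")
        # Assumption: the class name serves as the segment label.
        if label.startswith("ocr") and match:
            self.boxes.append((label, tuple(int(v) for v in match.groups())))


def extract_boxes(hocr_text):
    """Return all labelled bounding boxes found in an hOCR document string."""
    parser = HocrBoxParser()
    parser.feed(hocr_text)
    return parser.boxes


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python hocr_boxes.py output.hocr
    with open(sys.argv[1], encoding="utf-8") as f:
        for label, box in extract_boxes(f.read()):
            print(label, *box)
```

Each printed line is the class name followed by the four pixel coordinates, so filtering on the label (for example, keeping only `ocr_carea` rows) should give block-level regions without the per-word or per-character entries.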
