Page layout analysis using Tesseract?

一个人的身影 2020-12-23 10:40

Tesseract 3 is able to perform page layout analysis. However, I couldn't find any sample code or documentation on how to use the library for such purposes. I hope someone h

4 Answers

佛祖请我去吃肉 (OP)

2020-12-23 11:34
Tesseract can be given a page segmentation mode parameter (-psm in Tesseract 3, --psm in later releases), which can have the following values:
- 0 = Orientation and script detection (OSD) only.
- 1 = Automatic page segmentation with OSD.
- 2 = Automatic page segmentation, but no OSD or OCR.
- 3 = Fully automatic page segmentation, but no OSD. (Default)
- 4 = Assume a single column of text of variable sizes.
- 5 = Assume a single uniform block of vertically aligned text.
- 6 = Assume a single uniform block of text.
- 7 = Treat the image as a single text line.
- 8 = Treat the image as a single word.
- 9 = Treat the image as a single word in a circle.
- 10 = Treat the image as a single character.
Example:
```
tesseract image.tif image -l eng -psm 0
```
However, I am not sure that it is possible to use the layout analysis in standalone mode.
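If you want to try several modes from a script, the command line can be assembled programmatically. The following is only a minimal sketch, not an official API: the helper name `build_tesseract_command` and its `legacy_flag` parameter are made up for illustration, and it covers only the modes listed in this answer. The output base is passed without an extension because tesseract appends `.txt` itself.

```python
import os
import shutil
import subprocess

# Page segmentation modes exactly as listed in the answer above.
# Newer Tesseract releases define additional modes not covered here.
PSM_MODES = {
    0: "Orientation and script detection (OSD) only",
    1: "Automatic page segmentation with OSD",
    2: "Automatic page segmentation, but no OSD or OCR",
    3: "Fully automatic page segmentation, but no OSD (default)",
    4: "Single column of text of variable sizes",
    5: "Single uniform block of vertically aligned text",
    6: "Single uniform block of text",
    7: "Single text line",
    8: "Single word",
    9: "Single word in a circle",
    10: "Single character",
}


def build_tesseract_command(image_path, output_base, psm, lang="eng", legacy_flag=True):
    """Build the argv list for a tesseract call (hypothetical helper).

    legacy_flag=True emits "-psm" (Tesseract 3 spelling);
    legacy_flag=False emits "--psm" (later releases).
    """
    if psm not in PSM_MODES:
        raise ValueError(f"unknown page segmentation mode: {psm}")
    flag = "-psm" if legacy_flag else "--psm"
    return ["tesseract", image_path, output_base, "-l", lang, flag, str(psm)]


if __name__ == "__main__":
    cmd = build_tesseract_command("image.tif", "image", psm=1)
    print(" ".join(cmd))
    # Only run when both the binary and the sample image are present.
    if shutil.which("tesseract") and os.path.exists("image.tif"):
        subprocess.run(cmd, check=False)
```

Validating the mode before calling the binary keeps typos from turning into confusing tesseract errors; whether a given mode's output is useful for layout analysis alone still has to be checked against the installed version.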