Tesseract - Entire line output

Submitted by 不打扰是莪最后的温柔 on 2019-12-12 05:24:27

Question


I am trying to OCR a few tables using Tesseract. These tables have the following format:

Item One name                       Item One category
(Item description if any)

Item Two name                       Item Two category
(Item description if any)

There is some space between the name and the category on each line. The output produced looks like this:

Item One name
(Item description if any)

Item Two name
(Item description if any)


Item One category

Item Two category

Is there a way to get each entire line in the output, instead of this column-wise output with one column below the other?

I am running Tesseract through a simple command line:

tesseract ~/Desktop/imagename.jpg out

Answer 1:


Try a different page segmentation mode (PSM), such as 4 or 6, instead of the default.
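As a rough sketch of how the mode could be passed on the command line: mode 4 tells Tesseract to assume a single column of text of variable sizes, and mode 6 to assume a single uniform block of text, so either one may keep the name and category on the same output line. The helper name `ocr_with_psm` below is hypothetical and only wraps the command from the question. The `--psm` spelling is assumed to match Tesseract 4 or later; 3.x releases used `-psm`.

```shell
# Sketch: rerun the OCR with an explicit page segmentation mode.
# ocr_with_psm is a hypothetical helper name, not part of Tesseract.
ocr_with_psm() {
    image="$1"      # input image path
    out_base="$2"   # output base name; Tesseract writes <out_base>.txt
    psm="$3"        # page segmentation mode, e.g. 4 or 6
    # Tesseract 4 and later spell the option --psm; 3.x releases used -psm.
    tesseract "$image" "$out_base" --psm "$psm"
}

# Usage (writes out.txt):
#   ocr_with_psm ~/Desktop/imagename.jpg out 6
```

If mode 6 still splits the columns, mode 4 is worth trying next. Running `tesseract --help-psm` on recent versions lists the modes the installed build supports.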



Source: https://stackoverflow.com/questions/22687127/tesseract-entire-line-output
