How to find parameters supported in Tesseract OCR config file

Submitted by 柔情痞子 on 2021-01-20 17:52:04

Question


I want to know what parameters the config file used by Tesseract OCR accepts, how to write a config file, etc.

I can't find any documentation about this on their site. How can I determine what parameters are supported, and what they mean?


Answer 1:


I found the following instructions at the link below. They cover how to write the config file and where to place it:

A config file is a plain text file without a BOM and with Unix end-of-line marks (on Windows, an advanced text editor such as Notepad++ can produce this).

If you use the tesseract executable, this is the only way to change Tesseract parameters.

The config file should be located in your tessdata/configs directory. Have a look there for some examples.
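The encoding requirements above (no BOM, Unix line endings) can also be met from a script instead of a text editor. The following is a minimal sketch, not part of the quoted instructions: the helper name write_config and the file name myconfig are hypothetical, and it assumes one space-separated "name value" pair per line.

```python
from pathlib import Path


def write_config(path, params):
    """Write parameters as 'name value' lines: UTF-8 without BOM, LF line endings."""
    lines = [f"{name} {value}" for name, value in params.items()]
    # encoding="utf-8" (not "utf-8-sig") writes no BOM; newline="\n" stops
    # Python on Windows from translating line endings to CRLF.
    with open(path, "w", encoding="utf-8", newline="\n") as fh:
        fh.write("\n".join(lines) + "\n")
    return Path(path)
```

For example, write_config("myconfig", {"interactive_display_mode": "T"}) produces a one-line file that can then be copied into tessdata/configs.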

A list of all the variables, with a description of each, is at http://www.sk-spell.sk.cx/tesseract-ocr-parameters-in-302-version. Note that it covers Tesseract 3.02; other versions may differ.

Edit: Also adding a pastebin link in case the above link becomes dead.




Answer 2:


As of v3.04, Tesseract offers the command-line option --print-parameters, so you can run tesseract --print-parameters to get a list of the 678 (!) configurable parameters, their default values, and a short description of each:

Tesseract parameters:
editor_image_xpos   590 Editor image X Pos
editor_image_ypos   10  Editor image Y Pos
editor_image_menuheight 50  Add to image height for menu bar
editor_image_word_bb_color  7   Word bounding box colour
editor_image_blob_bb_color  4   Blob bounding box colour
editor_image_text_color 2   Correct text colour
...and many, many more
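With several hundred parameters, it can help to load this listing into a dictionary and search it. The sketch below is one way to do that in Python. It assumes each line holds a name, a default value, and a description, separated by tabs (falling back to whitespace when no tab is present), and that the listing is written to standard output. The function names are hypothetical.

```python
import subprocess


def parse_parameters(text):
    """Map parameter name -> (default, description) from --print-parameters output."""
    params = {}
    for line in text.splitlines():
        line = line.rstrip()
        # Skip blank lines and the "Tesseract parameters:" header.
        if not line or line.startswith("Tesseract parameters"):
            continue
        # Assumption: fields are tab-separated; fall back to whitespace otherwise.
        sep = "\t" if "\t" in line else None
        fields = line.split(sep, 2)
        fields += [""] * (3 - len(fields))  # pad lines with missing fields
        name, default, description = (f.strip() for f in fields)
        params[name] = (default, description)
    return params


def list_parameters(executable="tesseract"):
    """Run `tesseract --print-parameters` and parse what it prints."""
    result = subprocess.run(
        [executable, "--print-parameters"],
        capture_output=True, text=True, check=True,
    )
    return parse_parameters(result.stdout)
```

A search such as [n for n in list_parameters() if "whitelist" in n] then narrows the list to the names of interest.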



Answer 3:


It's just a plain text file containing space-delimited key/value pairs for Tesseract config variables, one pair per line; for instance:

interactive_display_mode T
tessedit_display_outwords T
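To check a config file before handing it to Tesseract, the key/value format shown above can be read back with a few lines of Python. This is only a sketch of the format as described here, not Tesseract's own parser. It assumes the key ends at the first run of whitespace and everything after it is the value, and the function name parse_config is hypothetical.

```python
def parse_config(text):
    """Parse 'key value' lines into a dict, skipping blank lines."""
    params = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split once: the key is the first token, the rest of the line is the value.
        key, _, value = line.partition(" ")
        params[key] = value.strip()
    return params
```

Applied to the two-line example above, this yields a dictionary with both variables set to "T".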

There are several standard config files, such as digits and hocr, in Tesseract's tessdata/configs folder.
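A config file in that folder is selected by giving its name after the output base on the tesseract command line. The sketch below builds such a command in Python. The helper name build_command and the file names are hypothetical, and it assumes the usual argument order of input image, output base, then config names.

```python
def build_command(image, output_base, *config_names, executable="tesseract"):
    """Return the argument list for running tesseract with named config files."""
    # Config names follow the output base; each one refers to a file
    # in the tessdata/configs folder.
    return [executable, image, output_base, *config_names]
```

Passing the result to subprocess.run, for example subprocess.run(build_command("page.png", "page", "digits"), check=True), would run the recognition with the digits config applied.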



Source: https://stackoverflow.com/questions/13007245/how-to-find-parameters-supported-in-tesseract-ocr-config-file
