Disable dictionary in Tesseract

Happy的楠姐 2020-12-09 06:49

How can I disable dictionary corrections when running Tesseract for the English language?

I'm currently running tesseract as a child process.

1 answer
  • 2020-12-09 07:27

    Try setting these variables to false (put them in a config file):

    load_system_dawg 
    load_freq_dawg
    load_punc_dawg
    load_number_dawg
    load_unambig_dawg
    load_bigram_dawg
    load_fixed_length_dawgs
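
Since the question mentions running tesseract as a child process, here is a minimal Python sketch of how such a config file might be written and passed along. It assumes the usual `name value` config line format with `F` for false, and that a config file path can be given as a trailing command-line argument. The file name `no_dict` and the helper names are hypothetical, and a given Tesseract version may not recognise every variable listed.

```python
import subprocess
import tempfile
from pathlib import Path

# Variable names are taken from the answer above; a given Tesseract
# version may not recognise all of them.
DAWG_VARIABLES = (
    "load_system_dawg",
    "load_freq_dawg",
    "load_punc_dawg",
    "load_number_dawg",
    "load_unambig_dawg",
    "load_bigram_dawg",
    "load_fixed_length_dawgs",
)


def build_config_text(variables=DAWG_VARIABLES):
    """Render one 'name value' line per variable, with F meaning false."""
    return "".join(f"{name} F\n" for name in variables)


def build_command(image_path, output_base, config_path, lang="eng"):
    """Assemble the tesseract command line; the config file goes last."""
    return ["tesseract", str(image_path), str(output_base),
            "-l", lang, str(config_path)]


def run_ocr(image_path, output_base):
    """Write a temporary config file and run tesseract as a child process."""
    with tempfile.TemporaryDirectory() as tmp:
        config_path = Path(tmp) / "no_dict"  # hypothetical file name
        config_path.write_text(build_config_text())
        subprocess.run(
            build_command(image_path, output_base, config_path),
            check=True,
        )
```

Calling `run_ocr("scan.png", "out")` would then be expected to write the recognised text to `out.txt`, provided the `tesseract` binary is on the PATH.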
    

    https://groups.google.com/forum/?fromgroups=#!searchin/tesseract-ocr/Disable$20dictionary$20in$20Tesseract/tesseract-ocr/5nvIo1DJxHE/f3gBi2pTKykJ

    Also read "How to increase the trust in/strength of the dictionary?" in the FAQ. From it:

    For tesseract-ocr < 3.01 try upping NON_WERD and GARBAGE_STRING in dict/permute.cpp to maybe 3 or even 5.

    For tesseract-ocr >= 3.01 try increasing the variables language_model_penalty_non_freq_dict_word and language_model_penalty_non_dict_word in a config file. By default they are 0.1 and 0.15 respectively.
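
For the >= 3.01 case, the two penalties could also be passed from a child-process call with tesseract's `-c name=value` option instead of a config file. This is only a sketch: the helper name `penalty_options` is hypothetical, and the values 0.3 and 0.5 are illustrative placeholders above the stated defaults, not recommendations.

```python
def penalty_options(non_freq_dict_word=0.3, non_dict_word=0.5):
    """Build '-c name=value' arguments that raise the two penalties.

    The default values here are illustrative assumptions only.
    """
    return [
        "-c", f"language_model_penalty_non_freq_dict_word={non_freq_dict_word}",
        "-c", f"language_model_penalty_non_dict_word={non_dict_word}",
    ]
```

The returned list can be appended to the argument list given to `subprocess.run`, for example `["tesseract", "scan.png", "out", "-l", "eng"] + penalty_options()`.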
