Tesseract or any other OCR lib

Posted by a 夏天 on 2019-12-09 11:41:23

Question


I'm looking for an explanation, API documentation, or examples of how to use (and train?) Tesseract in C++. I found nothing useful on the Google Tesseract page and have yet to find anything elsewhere on the web.

Any useful sources or experiences would be more than welcome, as I have no idea how to begin.

P.S:

  1. I'm open to suggestions for other libraries.
  2. Only FREE libraries.

Answer 1:


I have some experience with Tesseract. A simple Google search for 'training tesseract' turns up this page: http://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract, where you must choose which version of Tesseract you wish to train. While 3 is the latest version, it's brand new, so people are still ironing out issues; I'm still using version 2.4. Anyway, you'll see there are about 9 steps in training Tesseract for a particular 'language' (which should really have been called a 'font' or 'character set').

You could also just use the existing 'eng' language, but it depends on your application. For example, in my application I have to do the document analysis, take a particular region, and OCR a 13-character string of numbers. I needed high accuracy and didn't want it reading '5' as 'S' or '0' as 'O', so it was logical to create a dedicated 'language' for my font set consisting only of the characters 0..9, whereas you might not care if you get extra 'noise
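As a rough illustration of the '5' vs 'S' and '0' vs 'O' problem mentioned above, here is a minimal C++17 sketch of a post-processing check for a digits-only field. This is not the training approach the answer describes and it does not call Tesseract at all; it is a plain fallback you could run on whatever text an OCR engine returns. The helper names `toDigit` and `normalizeDigits` and the confusion table are hypothetical, chosen only for the example.

```cpp
#include <cstddef>
#include <optional>
#include <string>

// Hypothetical helper: map a character to the digit it most likely stands for.
// The confusion table below is an assumption for illustration, not measured data.
std::optional<char> toDigit(char c) {
    if (c >= '0' && c <= '9') return c;
    switch (c) {
        case 'O': case 'o': return '0';
        case 'S': case 's': return '5';
        case 'I': case 'l': return '1';
        case 'B': return '8';
        default: return std::nullopt;  // not a digit and not a known look-alike
    }
}

// Hypothetical helper: clean an OCR result that is expected to be a
// fixed-length string of digits. Whitespace is skipped; any other
// unrecognised character, or a wrong length, rejects the whole string.
std::optional<std::string> normalizeDigits(const std::string& raw,
                                           std::size_t expectedLength) {
    std::string out;
    for (char c : raw) {
        if (c == ' ' || c == '\n' || c == '\t' || c == '\r') continue;
        std::optional<char> d = toDigit(c);
        if (!d) return std::nullopt;
        out.push_back(*d);
    }
    if (out.size() != expectedLength) return std::nullopt;
    return out;
}
```

For the 13-character case in the answer, you would call `normalizeDigits(text, 13)` on the recognised text and treat an empty result as a failed read. Rejecting rather than guessing is a deliberate choice here: when high accuracy matters, a rejected read is usually cheaper than a silently wrong number.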




Answer 2:


Tesseract OCR is an open-source library for optical character recognition. If you are using Visual Studio, you just need to include the library files. If you are using Qt Creator, you have to build the library yourself so that it works with Qt, using a CMakeLists.txt file or the CMake GUI. See the link: Opencv Ocr build for Qt 5.4 mingw



Source: https://stackoverflow.com/questions/4314060/tesseract-or-any-other-ocr-lib
