What preprocessing operations are performed by Tesseract OCR?
Question: I couldn't find detailed documentation, and I don't feel like browsing the source code. For example, I don't want to redo Canny edge detection if the Tesseract engine already does it.

Answer 1: This document provides an overview of the engine: https://github.com/tesseract-ocr/docs/blob/master/tesseracticdar2007.pdf

So it looks like you don't need to implement Canny edge detection. Tesseract uses Otsu thresholding to binarize the image before processing it: https://github.com/tesseract-ocr/tesseract/blob
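
To give a feel for what Otsu thresholding does, here is a minimal pure-Python sketch of the textbook algorithm. It is not Tesseract's own code; the function names `otsu_threshold` and `binarize` and the sample pixel values are made up for illustration, and the input is assumed to be a flat list of 8-bit grayscale values.

```python
def otsu_threshold(pixels):
    """Return the gray level t (0-255) that maximizes between-class variance.

    Pixels with value <= t form the "dark" class, the rest the "bright" class.
    Illustrative sketch only, not Tesseract's implementation.
    """
    # Build a 256-bin histogram of gray levels.
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # pixel count of the dark class so far
    sum0 = 0  # sum of gray levels in the dark class so far
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue  # dark class still empty
        w1 = total - w0
        if w1 == 0:
            break     # bright class empty; no later split is possible
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        # Between-class variance, up to a constant factor of 1 / total**2.
        var_between = w0 * w1 * (mean0 - mean1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(pixels, threshold):
    """Map each pixel to 0 (dark) or 255 (bright) using the threshold."""
    return [0 if p <= threshold else 255 for p in pixels]


# Hypothetical sample: four dark pixels and four bright ones.
sample = [10, 12, 11, 13, 200, 205, 198, 202]
t = otsu_threshold(sample)
binary = binarize(sample, t)
print(t, binary)
```

On the sample above the threshold lands at the top of the dark cluster, so the four dark pixels map to 0 and the four bright ones to 255. If you already get a clean binarization this way, running an extra edge detector before handing the image to Tesseract is probably unnecessary.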