Changing image DPI for usage with tesseract

折月煮酒 submitted on 2021-01-27 05:28:34

Question


I am working on a project to recognize text on business cards and map it to the appropriate fields. I am using OpenCV for image processing, and I need to feed the preprocessed image to the Tesseract OCR engine for text recognition. This link states that images should have a DPI of at least 300. My image is 2560x1536 pixels at 72 DPI.

  • How do I increase the DPI to 300?
  • It is also said that resizing the image is beneficial. How should I resize my image to get good OCR results?
  • The linked page says: "Tesseract works best on images which have a DPI of at least 300 dpi, so it may be beneficial to resize images." What does "so" imply here? What is the relation between resizing an image and its DPI?

Answer 1:


For OCR, what really matters is the resolution in pixels, that is, how many pixels each character spans, because the physical characters can range from tiny to huge independently of the DPI of the acquisition device.
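To make the relation between pixels and DPI concrete, here is a small arithmetic sketch. It uses the 2560x1536 image from the question; the helper names and the 10-point font size are assumptions chosen for illustration. It shows that changing the DPI tag alone only changes the implied physical size, not the pixel data, and that a DPI recommendation is really a recommendation about how tall printed text ends up in pixels.

```python
POINTS_PER_INCH = 72  # one typographic point is 1/72 inch


def print_size_inches(width_px, height_px, dpi):
    """Physical size implied by a pixel size and a DPI tag."""
    return width_px / dpi, height_px / dpi


def text_height_px(point_size, dpi):
    """Approximate pixel height of text of a given point size scanned at `dpi`."""
    return point_size / POINTS_PER_INCH * dpi


# Same pixels, different DPI tag: only the implied physical size changes.
w72, h72 = print_size_inches(2560, 1536, 72)
w300, h300 = print_size_inches(2560, 1536, 300)
print(f"tagged  72 DPI: {w72:.1f} x {h72:.1f} in")
print(f"tagged 300 DPI: {w300:.1f} x {h300:.1f} in")

# Hypothetical 10 pt text: the scan resolution decides its height in pixels.
for dpi in (72, 300):
    print(f"10 pt text at {dpi} DPI is about {text_height_px(10, dpi):.1f} px tall")
```

In other words, "resize to reach 300 DPI" only makes sense when the physical size of the text is fixed; what the OCR engine sees is the resulting character size in pixels.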

As a rule of thumb, a stroke width of around 3 pixels is a good starting point. If the strokes are thinner, upscaling might not help because the information is missing. If they are much thicker, the running time might be excessive (or the OCR function might not be tailored to deal with them).
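The rule of thumb can be turned into a scale factor. The sketch below assumes the stroke width has already been measured somehow (the 8-pixel value is hypothetical) and that 3 pixels is the target; both function names are made up for this example.

```python
def scale_for_stroke(measured_stroke_px, target_stroke_px=3.0):
    """Scale factor that would bring the measured stroke width to the target.

    A result below 1 means downscaling (saves running time).
    A result above 1 means upscaling, which cannot restore missing detail.
    """
    if measured_stroke_px <= 0:
        raise ValueError("stroke width must be positive")
    return target_stroke_px / measured_stroke_px


def resized_dimensions(width_px, height_px, scale):
    """New pixel dimensions after applying `scale`, never smaller than 1x1."""
    return max(1, round(width_px * scale)), max(1, round(height_px * scale))


# Hypothetical measurement: strokes are about 8 px wide in the 2560x1536 image.
scale = scale_for_stroke(8.0)
new_w, new_h = resized_dimensions(2560, 1536, scale)
print(scale, new_w, new_h)

# With OpenCV, the factor could then be applied along these lines:
#   cv2.resize(img, None, fx=scale, fy=scale, interpolation=cv2.INTER_AREA)
```

`INTER_AREA` is a common choice when shrinking; for a factor above 1 an interpolating mode such as `INTER_CUBIC` would be the usual alternative, with the caveat above that upscaling adds no information.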

Also check that the package does not attempt to resize the image internally, based on its own assumptions about stroke width and the DPI information stored in the file header, if there is a mismatch between them.
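One way to see what DPI information a file header actually carries is to read it directly. The sketch below is an illustration for PNG files only, using nothing but the standard library: it looks for the `pHYs` chunk, which stores pixels per metre, and converts it to DPI. The function name `png_dpi` is an assumption of this example, and other formats (JPEG, TIFF) store the value differently.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
METRES_PER_INCH = 0.0254


def png_dpi(data):
    """Return (x_dpi, y_dpi) from a PNG's pHYs chunk, or None if not stated."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        # Each chunk: 4-byte big-endian length, 4-byte type, data, 4-byte CRC.
        length, chunk_type = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if chunk_type == b"pHYs" and length == 9:
            x_ppu, y_ppu, unit = struct.unpack(">IIB", body)
            if unit != 1:  # unit 0 gives only an aspect ratio, no physical size
                return None
            return x_ppu * METRES_PER_INCH, y_ppu * METRES_PER_INCH
        if chunk_type == b"IEND":
            break
        pos += 8 + length + 4  # skip header, data and CRC
    return None


# Usage (hypothetical file name):
#   with open("card.png", "rb") as f:
#       print(png_dpi(f.read()))
```

If this returns `None` or a value that does not match how the image was really acquired, that is the kind of mismatch worth ruling out before blaming the OCR results on the image itself.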



Source: https://stackoverflow.com/questions/44095676/changing-image-dpi-for-usage-with-tesseract
