Python OCR Module in Linux?

暗喜 2020-12-24 00:04

I am looking for an easy-to-use OCR module for Python on Linux. I found pytesser (http://code.google.com/p/pytesser/), but it ships with a .exe executable file.

I tried ch

5 Answers
  •  渐次进展
    2020-12-24 00:34

    You can just wrap tesseract in a function:

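    import os
    import subprocess
    import tempfile

    def ocr(path):
        # Reserve a unique base name; tesseract writes its text to <base>.txt.
        temp = tempfile.NamedTemporaryFile(delete=False)
        temp.close()

        try:
            # Run the tesseract CLI and fail loudly on a non-zero exit status.
            subprocess.run(['tesseract', path, temp.name], stdout=subprocess.PIPE,
                           stderr=subprocess.STDOUT, check=True)

            with open(temp.name + '.txt', 'r') as handle:
                return handle.read()
        finally:
            # Clean up the placeholder file and tesseract's output file.
            for name in (temp.name, temp.name + '.txt'):
                if os.path.exists(name):
                    os.remove(name)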
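
    A variant of the same wrapper, sketched under a few assumptions: reasonably recent Tesseract releases accept stdout as the output base, which avoids the temporary file altogether. The names build_command and ocr_to_string are hypothetical helpers made up for this illustration, the -l language option is only passed when requested, and Python 3.5+ is assumed for subprocess.run.

```python
import shutil
import subprocess


def build_command(image_path, lang=None):
    """Build the argument list for a tesseract run that prints to stdout."""
    # Passing 'stdout' as the output base makes tesseract write the
    # recognized text to standard output instead of <base>.txt.
    command = ['tesseract', image_path, 'stdout']
    if lang:
        # -l selects the language data to use, e.g. 'eng'.
        command += ['-l', lang]
    return command


def ocr_to_string(image_path, lang=None):
    """Run tesseract on image_path and return the recognized text."""
    if shutil.which('tesseract') is None:
        raise RuntimeError('tesseract executable not found on PATH')
    # check=True raises CalledProcessError if tesseract exits with an error.
    result = subprocess.run(build_command(image_path, lang),
                            stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE,
                            check=True)
    return result.stdout.decode('utf-8')
```

    Calling ocr_to_string('scan.png', lang='eng') would then return the text directly. Keeping the command construction in its own function makes it possible to check the arguments without having tesseract installed.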
    

    If you want document segmentation and more advanced features, try out OCRopus.
