Getting the bounding box of the recognized words using python-tesseract

不思量自难忘° 2020-11-30 19:21

I am using python-tesseract to extract words from an image. It is a Python wrapper for Tesseract, an OCR engine.

I am using the following code for getting th

7 answers
  •  无人及你
    2020-11-30 20:15

    To get bounding boxes over words:

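    import cv2
    import pytesseract
    from pytesseract import Output

    # Load the image (this path is specific to the answerer's machine)
    img = cv2.imread('/home/gautam/Desktop/python/ocr/SEAGATE/SEAGATE-01.jpg')

    # Run OCR; Output.DICT returns a dict of parallel lists, one entry per detected element
    d = pytesseract.image_to_data(img, output_type=Output.DICT)
    n_boxes = len(d['level'])
    for i in range(n_boxes):
        # Skip entries without recognized text (page, block, and line rows, or blanks)
        if d['text'][i].strip():
            x, y, w, h = d['left'][i], d['top'][i], d['width'][i], d['height'][i]
            # Draw a green rectangle, 2 px thick, around the word
            cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)

    # Save the annotated image
    cv2.imwrite('result.png', img)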
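
If you also want to drop low-confidence detections, here is a minimal sketch that works on the same kind of dict. It assumes the dict has a `conf` entry alongside `text`, `left`, `top`, `width` and `height`, and that non-word rows carry a confidence of -1. The helper name `word_boxes`, the `min_conf` parameter and the sample data are made up for illustration, so it runs without Tesseract installed.

```python
def word_boxes(data, min_conf=0.0):
    """Return (text, x, y, w, h) tuples for non-empty words in an image_to_data-style dict."""
    boxes = []
    for i, text in enumerate(data["text"]):
        # Skip rows with no recognized text
        if not text.strip():
            continue
        # Confidence may arrive as a string or a number, so normalize it to float
        if float(data["conf"][i]) < min_conf:
            continue
        boxes.append((text, data["left"][i], data["top"][i],
                      data["width"][i], data["height"][i]))
    return boxes


# Hand-written sample shaped like the OCR output (not real OCR results)
sample = {
    "text": ["", "Hello", " ", "world"],
    "conf": ["-1", "96.5", "-1", "41.0"],
    "left": [0, 10, 0, 80],
    "top": [0, 12, 0, 12],
    "width": [200, 60, 0, 70],
    "height": [50, 20, 0, 20],
}

print(word_boxes(sample, min_conf=60))
```

With real data you would pass `d` from the snippet above instead of `sample`, then draw each returned box with `cv2.rectangle` as before. The threshold of 60 is only a starting point; tune it against your own images.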
    
