Getting the bounding box of the recognized words using python-tesseract

不思量自难忘° 2020-11-30 19:21

I am using python-tesseract to extract words from an image. It is a Python wrapper for Tesseract, an OCR engine.

I am using the following code for getting th

7 answers
  •  天命终不由人
    2020-11-30 19:54

    The answers above give examples that work with pytesseract. If you are using the tesserocr Python library instead, the code below finds each individual word and its bounding box:

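        from tesserocr import PyTessBaseAPI, RIL, iterate_level

        imagePath = "image.png"  # placeholder path, replace with your own image

        with PyTessBaseAPI(psm=6, oem=1) as api:  # psm=6: single uniform text block, oem=1: LSTM engine
            level = RIL.WORD
            api.SetImageFile(imagePath)
            api.Recognize()
            ri = api.GetIterator()
            # The iterator starts on the first word, so calling ri.Next() before
            # reading would skip it; iterate_level yields the current element first.
            for r in iterate_level(ri, level):
                word = r.GetUTF8Text(level)
                boxes = r.BoundingBox(level)  # (left, top, right, bottom) in pixels
                print(word, "word")
                print(boxes, "coords")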
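
For the pytesseract route mentioned at the start of this answer, here is a minimal sketch of turning the result of `pytesseract.image_to_data(..., output_type=Output.DICT)` into per-word boxes. The helper names `word_boxes` and `boxes_from_image` and the `min_conf` parameter are made up for illustration and are not part of pytesseract. The sketch assumes the dictionary holds parallel lists under the keys `text`, `conf`, `left`, `top`, `width` and `height`, and it returns boxes as `(left, top, right, bottom)` so they line up with what tesserocr's `BoundingBox` gives you.

```python
def word_boxes(data, min_conf=0.0):
    """Collect (text, (left, top, right, bottom)) for each non-empty word.

    `data` is assumed to look like the dict returned by pytesseract's
    image_to_data(..., output_type=Output.DICT): parallel lists under the
    keys "text", "conf", "left", "top", "width", "height".
    """
    boxes = []
    for i, text in enumerate(data["text"]):
        text = text.strip()
        # conf can be an int, float, or str depending on the version;
        # rows that are not words (blocks, lines) carry conf == -1.
        conf = float(data["conf"][i])
        if not text or conf < min_conf:
            continue
        left, top = data["left"][i], data["top"][i]
        right = left + data["width"][i]
        bottom = top + data["height"][i]
        boxes.append((text, (left, top, right, bottom)))
    return boxes


def boxes_from_image(image_path):
    """Hypothetical wrapper: needs pytesseract, Pillow and the tesseract binary."""
    # Imported lazily so word_boxes() can be used without these packages.
    import pytesseract
    from PIL import Image
    from pytesseract import Output

    data = pytesseract.image_to_data(Image.open(image_path), output_type=Output.DICT)
    return word_boxes(data)
```

Calling `boxes_from_image("image.png")` (the path is a placeholder) should give a list of words with their pixel coordinates. Raising `min_conf` drops words that Tesseract is less sure about.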
    
