Pytesseract random bug when reading text

I'm creating a bot for a video game and I have to read some information displayed on the screen. Since the information is always at the same position, I have no trouble taking a screenshot and cropping the picture to the right region.

90% of the time, the recognition will be perfect, but sometimes it will return something that seems totally random (see the example below).

I've tried converting the picture to black and white, with no success, and tried changing the pytesseract config (config = ("-l fra --oem 1 --psm 6")).

import cv2
import mss
import mss.tools
import pytesseract

def readScreenPart(x, y, w, h):
    # screen region to capture
    monitor = {"top": y, "left": x, "width": w, "height": h}
    output = "monitor.png"
    with mss.mss() as sct:
        sct_img = sct.grab(monitor)
        mss.tools.to_png(sct_img.rgb, sct_img.size, output=output)

    # reload the screenshot and convert it to grayscale
    img = cv2.imread(output)
    img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    cv2.imwrite("result.png", img)
    config = "-l fra --oem 1 --psm 6"

    return pytesseract.image_to_string(img, config=config)

Example: this picture triggers the bug; it returns the string "IRPMV/LEIILK"

Another image

Now I don't know where the issue comes from, given that the result is not just a single wrong character but completely random.

Thanks for your help

As the comment said, it's about your text and background color. Tesseract does very poorly with light text on a dark background. Here are the few lines I apply to any text image before giving it to Tesseract:

import cv2
import numpy

# convert the color image to grayscale
grayscale_image = cv2.cvtColor(your_image, cv2.COLOR_BGR2GRAY)

# Otsu's thresholding picks the threshold automatically and returns an image with only black and white pixels
_, binary_image = cv2.threshold(grayscale_image, 0, 255, cv2.THRESH_OTSU)

# we don't know whether the text is black on a white background or vice versa,
# so we count how many black pixels and white pixels there are
count_white = numpy.sum(binary_image > 0)
count_black = numpy.sum(binary_image == 0)

# if there are more black pixels than white ones, the background is black, so we invert the image
if count_black > count_white:
    binary_image = 255 - binary_image

black_text_white_background_image = binary_image

Now you're sure to have black text on a white background, no matter which colors the original image had. Also, Tesseract is (weirdly) most accurate when the characters are about 35 pixels high: larger characters don't significantly reduce accuracy, but characters just a few pixels shorter can make Tesseract useless!
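To act on the character-height remark, one option is to measure how tall the text is in the binarized image and derive a scale factor from it. The following is only a rough sketch: the function names and the `TARGET_TEXT_HEIGHT` constant are hypothetical, it assumes a single line of black text on a white background, and it measures the whole line (ascenders to descenders) rather than individual characters.

```python
import numpy as np

# Rule of thumb quoted in this answer, not an official Tesseract constant.
TARGET_TEXT_HEIGHT = 35


def estimate_text_height(binary):
    """Height in pixels of the band of rows containing black (text) pixels.

    Assumes one line of black text (0) on a white background (255).
    """
    rows_with_text = np.flatnonzero((binary == 0).any(axis=1))
    if rows_with_text.size == 0:
        return 0  # no text pixels found
    return int(rows_with_text[-1] - rows_with_text[0] + 1)


def scale_factor_for(binary, target=TARGET_TEXT_HEIGHT):
    """Factor by which to resize the image so the text is about `target` pixels high."""
    height = estimate_text_height(binary)
    if height == 0:
        return 1.0  # nothing to measure, leave the image unchanged
    return target / height
```

The resulting factor could then be passed to `cv2.resize(image, None, fx=scale, fy=scale, interpolation=cv2.INTER_CUBIC)`. Resizing the grayscale image and thresholding afterwards tends to keep the edges cleaner than resizing an already binarized image.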

Preprocessing is an important step before throwing the image into Pytesseract. Generally, you want the desired text in black on a white background. Currently, your foreground text is green instead of white. Here's a simple process to fix the format:

Convert image to grayscale
Otsu's threshold to obtain a binary image
Invert image

Original image

Otsu's threshold

Invert image

Output from Pytesseract

122 Vitalité

Other image

200 Vitalité

Before inverting the image, it may be a good idea to perform morphological operations to smooth/filter the text. For your images, though, the text does not necessarily require additional smoothing.

import cv2
import pytesseract

pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"

image = cv2.imread('3.png',0)
thresh = cv2.threshold(image, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
result = 255 - thresh

data = pytesseract.image_to_string(result, lang='eng',config='--psm 6')
print(data)

cv2.imshow('thresh', thresh)
cv2.imshow('result', result)
cv2.waitKey()

Source: https://stackoverflow.com/questions/58022929/pytesseract-random-bug-when-reading-text