Image to Text - Pytesseract struggles with digits on Windows

岁酱吖の submitted on 2021-02-11 12:03:28


I'm trying to preprocess frames of a game in real time for an ML project. I want to extract numbers from each frame, so I chose Pytesseract, since it seemed to handle text well. However, no matter how clear I make the text, it doesn't read it correctly. My code looks like this:

section = process_screen(screen_image)[1]
pixels = rgb_to_bw(section)  # Makes the image grayscale
pixels[pixels < 200] = 0     # Makes all non-white pixels black

=> 'ye ml)'

At best it outputs "ye ml)" when I don't specify that I want digits; when I do, it outputs nothing at all.

The non-processed game image looks like so:

The "pixels" image looks like so:

Thanks to Alex Alex, I inverted the image and got this:

The output was then "2710", which is better, but still not perfect.


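You must invert the image before recognition. Tesseract works best with dark text on a light background, and your processed image has white digits on black.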
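A minimal sketch of the inversion step, assuming the frame is already a grayscale NumPy array like "pixels" in the question. The function names prepare_for_ocr and read_digits are hypothetical, and the threshold of 200 is taken from the question. The `--psm 7` flag (treat the image as a single text line) and the `tessedit_char_whitelist` setting are standard Tesseract options, but whether they improve the result on this particular game font is untested here.

```python
import numpy as np


def prepare_for_ocr(gray, threshold=200):
    """Binarize a grayscale frame and invert it to dark text on a light background."""
    gray = np.asarray(gray, dtype=np.uint8)
    # Bright pixels (the digits) become black; everything else becomes white.
    return np.where(gray >= threshold, 0, 255).astype(np.uint8)


def read_digits(gray, threshold=200):
    """Run Tesseract on the inverted frame, restricted to digits (hypothetical helper)."""
    # Imported here so the preprocessing step can be used without Tesseract installed.
    import pytesseract

    binary = prepare_for_ocr(gray, threshold)
    # --psm 7: single text line; the whitelist limits output to the characters 0-9.
    config = "--psm 7 -c tessedit_char_whitelist=0123456789"
    return pytesseract.image_to_string(binary, config=config).strip()
```

With the code from the question, this would be called as read_digits(rgb_to_bw(section)). If the digits are still misread, upscaling the cropped region before recognition is a common next thing to try, since Tesseract tends to do poorly on very small glyphs.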

