python captcha decoder library

Submitted by 微笑、不失礼 on 2019-12-28 12:45:32

Question


I need a Captcha decoder for python to read simple image captchas like the following picture:

Do you know of a library that can help me read this captcha?

If you don't know of a library for reading captchas, could you help me to read this (and others like this) with PIL?


Answer 1:


I hope this captcha is not used anywhere.

The following is a naive way to decode it. What you need are the patterns for the digits 0 to 9 as they appear in these captchas; from your examples, I only have the patterns for 0, 3, 4, 5, 7 and 8. Since the layout is fixed, you know where to split each character, and you also know that each character is a digit of fixed size and fixed font. If the captcha also includes letters or more characters, still of fixed size and font, the following code can be adapted easily.

What the code does is: a) load the patterns (I assumed they are named n0.png, n1.png, ...); b) split the captcha into NUMS pieces; c) compute the sum of squared differences between each pattern and each split number; d) decide that the split number is the pattern with the smallest sum. It returns a list of the numbers present in the captcha, in order. To obtain the initial patterns, you can uncomment the lines that save the split numbers, place a return after that piece, and adjust the file names.

import sys
from PIL import Image, ImageOps

PAT_SIZE = (8, 10)        # (width, height) of one digit pattern, in pixels
NUMS = 3                  # number of digits in each captcha
FIRST_NUM_OFFSET = 5      # x offset of the first digit
NUM_OFFSET = (1, 3)       # (horizontal gap between digits, y offset of digits)


# Load the reference patterns n0.png ... n9.png as grayscale pixel maps.
NUMBERS = []
for i in range(10):
    try:
        pattern = ImageOps.grayscale(Image.open('n%d.png' % i))
        NUMBERS.append(pattern.load())
    except IOError:
        print("I do not know the pattern for the number %d." % i)
        NUMBERS.append(None)


def magic(fname):
    captcha = ImageOps.grayscale(Image.open(fname))

    # Split numbers
    num = []
    for n in range(NUMS):
        x1, y1 = (FIRST_NUM_OFFSET + n * (NUM_OFFSET[0] + PAT_SIZE[0]),
                  NUM_OFFSET[1])
        num.append(captcha.crop((x1, y1, x1 + PAT_SIZE[0], y1 + PAT_SIZE[1])))

    # If you want to save the split numbers:
    #for i, n in enumerate(num):
    #    n.save('%d.png' % i)

    def sqdiff(a, b):
        if a is None or b is None:  # XXX This is here just to handle missing pattern.
            return float('inf')

        d = 0
        for x in range(PAT_SIZE[0]):
            for y in range(PAT_SIZE[1]):
                d += (a[x, y] - b[x, y]) ** 2
        return d

    # Calculate a dummy sum of squared differences between the patterns
    # and each number. We assume the smallest diff is the number in the
    # "captcha".
    result = []
    for n in num:
        n_sqdiff = [(sqdiff(p, n.load()), i) for i, p in enumerate(NUMBERS)]
        result.append(min(n_sqdiff)[1])
    return result


if __name__ == '__main__':
    print(magic(sys.argv[1]))
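
To make steps (c) and (d) concrete without needing any image files, here is a minimal sketch of the sum-of-squared-differences comparison on tiny hand-made grids. The 3x3 `PATTERNS` and the helper names `ssd` and `classify` are invented for illustration and are not part of the code above; real patterns would be the 8x10 crops taken from the captchas.

```python
# Toy 3x3 grayscale "patterns" (0 = black ink, 255 = white background).
# These grids are invented for illustration only.
PATTERNS = {
    1: [[255, 0, 255],
        [255, 0, 255],
        [255, 0, 255]],
    7: [[0, 0, 0],
        [255, 255, 0],
        [255, 255, 0]],
}


def ssd(a, b):
    """Sum of squared differences between two equally sized grids."""
    return sum((pa - pb) ** 2
               for row_a, row_b in zip(a, b)
               for pa, pb in zip(row_a, row_b))


def classify(glyph, patterns=PATTERNS):
    """Return the label whose pattern has the smallest SSD to the glyph."""
    return min(patterns, key=lambda label: ssd(patterns[label], glyph))


# A slightly noisy "7": one background pixel is grey instead of white.
noisy_seven = [[0, 0, 0],
               [255, 200, 0],
               [255, 255, 0]]

print(classify(noisy_seven))  # prints 7
```

The noisy glyph differs from the stored 7 in a single pixel, so its sum is far smaller than the sum against the 1, and the smallest-sum rule picks 7.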



Answer 2:


It is a nice project to do for academic reasons; I was interested in this a while ago. You have a few options:

  1. Write your own with the help of this site: http://www.wausita.com/captcha/

  2. Use OpenCV to do the matching.

I think there was a dedicated library for neural network image matching, but I can't seem to find it.

Basically, as the others said, you want to remove the noise, split the image into single characters, and compare each one against the model characters using a technique of your choice.
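
As a rough sketch of the first two stages (noise removal and splitting), the snippet below uses PIL on a tiny image built in memory. The `THRESHOLD` and `CHAR_WIDTH` values and the helper names `binarize`, `split_chars` and `to_rows` are assumptions made for this example; a real captcha would need values measured from the actual images, and the comparison stage is left out here.

```python
from PIL import Image

THRESHOLD = 128   # assumed cut-off between ink and background
CHAR_WIDTH = 3    # assumed fixed character width in pixels


def binarize(img, threshold=THRESHOLD):
    """Map every pixel to pure black (0) or pure white (255)."""
    return img.convert('L').point(lambda p: 255 if p >= threshold else 0)


def split_chars(img, char_width=CHAR_WIDTH):
    """Cut the image into equally wide vertical slices, left to right."""
    width, height = img.size
    return [img.crop((x, 0, x + char_width, height))
            for x in range(0, width - char_width + 1, char_width)]


def to_rows(img):
    """Return the pixels of a grayscale image as a list of rows."""
    width, height = img.size
    return [[img.getpixel((x, y)) for x in range(width)]
            for y in range(height)]


# Build a tiny 6x2 test image in memory: white background, one dark
# "ink" pixel in each 3-pixel-wide cell, and one faint noise pixel.
captcha = Image.new('L', (6, 2), 255)
captcha.putpixel((1, 0), 20)    # ink in the first cell
captcha.putpixel((4, 1), 30)    # ink in the second cell
captcha.putpixel((2, 1), 200)   # faint noise, removed by the threshold

clean = binarize(captcha)
pieces = split_chars(clean)
for piece in pieces:
    print(to_rows(piece))
```

A fixed threshold only removes noise that is lighter than the ink; speckles as dark as the characters would need a different filter.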




Answer 3:


I hope you are using it in good faith and are not going to harm (or spam) anyone.

I won't write you the script or point you to an external plugin. But in case you are writing this on your own, this may help:

  • If you are trying to decode a specific captcha pattern, you should collect all of its characters (I saw from the examples you attached that it's only numbers, so it shouldn't be a lot of work).
  • Put all of the characters in one file and analyze it with PIL.
  • Save each character in an array, along with its position and its meaning.
  • Get a captcha image and clear the background noise if necessary.
  • Split the captcha image into character-sized pieces and match each piece against your self-made dictionary of characters.
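
The list above can be sketched in a few lines once each character is stored as a small bitmap. In this illustration the `GLYPHS` dictionary, the `CHAR_WIDTH` value and the helpers `split_columns` and `decode` are all hypothetical, and the captcha is written as text rows ('#' for ink, '.' for background) rather than loaded with PIL, so that the lookup logic stays visible.

```python
# Hypothetical glyph dictionary: each known character is stored as a tuple
# of strings, where '#' is ink and '.' is background.
GLYPHS = {
    ('.#.',
     '.#.',
     '.#.'): '1',
    ('###',
     '..#',
     '..#'): '7',
}

CHAR_WIDTH = 3  # assumed fixed width of every character


def split_columns(rows, char_width=CHAR_WIDTH):
    """Split a captcha (list of equal-length strings) into glyph tuples."""
    count = len(rows[0]) // char_width
    return [tuple(row[i * char_width:(i + 1) * char_width] for row in rows)
            for i in range(count)]


def decode(rows):
    """Look each glyph up in GLYPHS; unknown glyphs become '?'."""
    return ''.join(GLYPHS.get(glyph, '?') for glyph in split_columns(rows))


captcha = ['###.#.',
           '..#.#.',
           '..#.#.']
print(decode(captcha))  # prints 71
```

An exact dictionary lookup like this only works when the captcha has been cleaned to the point where every glyph is pixel-identical to a stored one; with residual noise, a nearest-match comparison is the more forgiving choice.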


Source: https://stackoverflow.com/questions/13664161/python-captcha-decoder-library
