Simple hash of PIL image

醉话见心 2020-12-31 21:41

Background

I want to store information about PIL images in a key-value store. To do that, I hash the image and use the hash as the key.

What I tried

I h

2 Answers
  •  粉色の甜心
    2020-12-31 22:14

    I'm guessing your goal is to perform image hashing in Python. This is quite different from classic hashing, because the byte representation of an image depends on its format, resolution, and so on.

    One image hashing technique is average hashing. Keep in mind that it is not 100% accurate, but it works fine in most cases.


    First, we simplify the image by reducing its size and colors. Reducing the complexity of the image this way contributes greatly to the accuracy of comparisons with other images:

    Reducing size:

    from PIL import Image
    img = img.resize((10, 10), Image.LANCZOS)  # Image.ANTIALIAS was removed in Pillow 10; LANCZOS is the same filter

    Reducing colors:

    img = img.convert("L")

    Then we find the average pixel value of the image, which serves as the threshold in average hashing:

    pixel_data = list(img.getdata())
    avg_pixel = sum(pixel_data)/len(pixel_data)
    

    Finally, the hash is computed. We compare each pixel in the image to the average pixel value: if the pixel is greater than or equal to the average, we get 1, otherwise 0. Then we convert these bits to a base-16 representation:

    bits = "".join(['1' if (px >= avg_pixel) else '0' for px in pixel_data])
    # Zero-pad to a fixed width so leading 0 bits are kept and all hashes have the same length.
    hex_representation = f"{int(bits, 2):0{len(bits) // 4}X}"
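
    The steps above can be collected into one function. This is only a minimal sketch, not a definitive implementation: the function name `average_hash` and the `size` parameter are my own, it reads pixels with `tobytes()` (one byte per pixel in "L" mode) rather than `getdata()`, and it zero-pads the hex string to a fixed width, which is an assumption on my part so that hashes of different images line up.

```python
from PIL import Image


def average_hash(img, size=10):
    """Return an average hash of a PIL image as an uppercase hex string.

    `average_hash` and `size` are illustrative names, not part of Pillow.
    """
    # Shrink the image, then convert it to 8-bit grayscale.
    small = img.resize((size, size), Image.LANCZOS).convert("L")

    # In "L" mode every pixel is exactly one byte (0-255), in row order.
    pixel_data = list(small.tobytes())
    avg_pixel = sum(pixel_data) / len(pixel_data)

    # 1 if the pixel is at or above the average, else 0.
    bits = "".join("1" if px >= avg_pixel else "0" for px in pixel_data)

    # Zero-pad so every hash has the same number of hex digits.
    width = (len(bits) + 3) // 4
    return f"{int(bits, 2):0{width}X}"
```

    With the default `size=10` the hash has 100 bits, so the string is always 25 hex digits long. You would call it as `average_hash(Image.open(path))` with a path of your own.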
    

    If you want to compare this image to other images, perform the steps above on each of them and measure the similarity between the hexadecimal representations of their average hashes. You can use something as simple as Hamming distance, or more complex algorithms such as Levenshtein distance, Ratcliff/Obershelp pattern recognition (SequenceMatcher), cosine similarity, etc.
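
    As a rough illustration of the simplest option, here is a Hamming distance over two hex hashes. It is a sketch that assumes both hashes are hex strings of the same length; the names `hamming_distance` and `similarity` are mine, and the distance is counted in bits rather than in hex characters.

```python
def hamming_distance(hash_a, hash_b):
    """Count the bits that differ between two equal-length hex hashes."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    # XOR leaves a 1 bit wherever the two hashes disagree.
    return bin(int(hash_a, 16) ^ int(hash_b, 16)).count("1")


def similarity(hash_a, hash_b):
    """Fraction of matching bits, from 0.0 (none) to 1.0 (identical)."""
    total_bits = len(hash_a) * 4  # each hex digit encodes 4 bits
    return 1 - hamming_distance(hash_a, hash_b) / total_bits
```

    A small distance suggests the images look alike. What counts as "small" depends on your data, so treat any threshold as something to tune rather than a fixed rule.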
