Question
I have a folder of images. Some of them have duplicates, similar images (the same scene from another angle), or modified versions (images that differ in size, blur level, or noise filtering). My task is to determine whether any of these images have similar counterparts.
I found this code, but I can't understand how the output number describes the similarity of two images when one of them is modified or shows the same scene from another angle.
import numpy as np
from PIL import Image

def compare(file1, file2):
    im = [None, None]  # to hold two arrays
    for i, f in enumerate([file1, file2]):
        im[i] = (np.array(
            Image.open('C:/Users/taras/Downloads/dev_dataset/dev_dataset/'+f+'.jpg')
            .convert('L')  # convert to grayscale using PIL
            .resize((32,32), resample=Image.BICUBIC))  # reduce size and smooth a bit using PIL
        ).astype(int)  # convert from unsigned bytes to signed int (np.int was removed in NumPy 1.24)
    return np.abs(im[0] - im[1]).sum()  # sum of absolute pixel differences
Answer 1:
The code converts the image to greyscale and resizes it to 32x32 pixels. That means all details are lost and you just get a general idea of the colour/brightness at 1024 points regardless of the shape or size of the original image.
It then does that for the second image too and it then has 1024 brightnesses for each image. It works out the absolute difference between each pair of brightnesses by subtraction and then totals all the differences up.
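If the images are identical, every difference will be zero and so the result will be zero. A resized, slightly blurred or noisy copy shrinks to almost the same 32x32 thumbnail, so its total stays small. If the images are very different, they will have different brightnesses in each area and adding up those differences will come to a large number.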
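To see that behaviour without needing any image files, here is a minimal sketch that applies the same sum of absolute differences to small synthetic arrays. The function name `sad` and the three test arrays are made up for illustration; they stand in for the 32x32 grey thumbnails and are not part of the code in the question.

```python
import numpy as np

def sad(a, b):
    """Sum of absolute differences between two equally shaped arrays."""
    # Cast to a signed type first: subtracting uint8 arrays would wrap around.
    a = np.asarray(a).astype(int)
    b = np.asarray(b).astype(int)
    return int(np.abs(a - b).sum())

rng = np.random.default_rng(0)

# Stand-in for a 32x32 greyscale thumbnail (hypothetical data).
base = rng.integers(0, 256, size=(32, 32), dtype=np.uint8)

# A lightly modified copy: every pixel nudged by at most 5 grey levels.
noisy = np.clip(base.astype(int) + rng.integers(-5, 6, size=base.shape), 0, 255)

# A very different image: the negative of the original.
inverted = 255 - base

print(sad(base, base))      # identical images
print(sad(base, noisy))     # lightly modified copy
print(sad(base, inverted))  # very different image
```

Dividing the total by the number of pixels (1024 here) gives a mean difference per pixel between 0 and 255, which is easier to compare against a threshold than the raw sum.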
It is like a "Perceptual Hash" if you feel like Googling.
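For comparison, one simple scheme from the perceptual-hash family is the "average hash": each pixel of the small grey image becomes one bit, set when the pixel is brighter than the image's mean, and two hashes are compared by counting differing bits. The sketch below is an illustration of that idea only, not what the code in the question does; the names `average_hash` and `hamming` and the test arrays are hypothetical.

```python
import numpy as np

def average_hash(grey):
    """Flat boolean array: True where a pixel is brighter than the image mean."""
    grey = np.asarray(grey, dtype=float)
    return (grey > grey.mean()).ravel()

def hamming(h1, h2):
    """Number of positions at which two hashes differ."""
    return int(np.count_nonzero(h1 != h2))

# Hypothetical 8x8 grey image: a simple ramp from dark to bright.
ramp = np.arange(64).reshape(8, 8)

# The same image made uniformly brighter: the mean shifts with it,
# so every bit of the hash stays the same.
brighter = ramp + 40

# The ramp rotated by 180 degrees: dark and bright halves swap places.
rotated = ramp[::-1, ::-1]

print(hamming(average_hash(ramp), average_hash(brighter)))
print(hamming(average_hash(ramp), average_hash(rotated)))
```

Unlike the raw sum of differences, this comparison ignores a uniform change in brightness, because each pixel is judged only against its own image's mean.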
Here is Mr Bean and an 8x8 grey version - think of it as a vector of 64 numbers:
Here are the numbers:
255 253 255 207 124 255 254 255
255 252 255 178  67 245 255 254
255 255 255 193 154 255 255 255
255 249 183 142 192 253 251 255
255 216  92 180 156 215 254 255
255 181  96 179 115 194 255 254
255 153  95 175  92 102 246 255
255 112  98 163  97  50 195 255
Here is Paddington and an 8x8 grey version - he too is now just 64 numbers:
Here are the numbers:
247 244 166 123 114  65   0   2
223 235 163  65  30  48  20   0
218 197  59  61 110  37  32   0
140  67  14 149 183  65   7   2
 57  25  64 175 169  69   0   2
 51  29  57 131 112  31   3   0
 60  63  59  38  14  51  32   0
 59  87  61  13  11  53  46   0
Then the maths is easy:
abs(255-247) + abs(253-244) + abs(255-166) ...
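Carried through all 64 pairs, that sum can be checked with a few lines of NumPy. This is only a sketch: the variable names `bean` and `paddington` are made up here, and the values are the two lists of 64 numbers given above.

```python
import numpy as np

# 8x8 grey version of the Mr Bean image, read row by row.
bean = np.array([
    255, 253, 255, 207, 124, 255, 254, 255,
    255, 252, 255, 178,  67, 245, 255, 254,
    255, 255, 255, 193, 154, 255, 255, 255,
    255, 249, 183, 142, 192, 253, 251, 255,
    255, 216,  92, 180, 156, 215, 254, 255,
    255, 181,  96, 179, 115, 194, 255, 254,
    255, 153,  95, 175,  92, 102, 246, 255,
    255, 112,  98, 163,  97,  50, 195, 255,
])

# 8x8 grey version of the Paddington image, read row by row.
paddington = np.array([
    247, 244, 166, 123, 114,  65,   0,   2,
    223, 235, 163,  65,  30,  48,  20,   0,
    218, 197,  59,  61, 110,  37,  32,   0,
    140,  67,  14, 149, 183,  65,   7,   2,
     57,  25,  64, 175, 169,  69,   0,   2,
     51,  29,  57, 131, 112,  31,   3,   0,
     60,  63,  59,  38,  14,  51,  32,   0,
     59,  87,  61,  13,  11,  53,  46,   0,
])

# Arrays built from Python ints are signed, so the subtraction cannot wrap.
diffs = np.abs(bean - paddington)

print(diffs[:3])    # first three terms: 8, 9 and 89
print(diffs.sum())  # the similarity score for this pair
```

Comparing either array with itself in the same way gives 0, which is the "identical images" case.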
Source: https://stackoverflow.com/questions/56188722/how-can-i-define-if-two-images-are-similar