I have used Andrea Vedaldi's SIFT implementation to calculate the SIFT descriptors of two similar images (the second image is actually a zoomed-in picture of the same
Try comparing each descriptor from the first image with the descriptors from the second image that lie in its close vicinity, using the Euclidean distance. This gives each descriptor from the first image a score based on how similar it is to its most similar neighbor descriptor in the second image. A statistical measure of all these scores (sum, mean, dispersion, mean error, etc.) then gives you an estimate of how similar the images are. Experiment with different combinations of vicinity size and statistical measure to find the one that gives the best answer.
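As a rough illustration, the idea could be sketched in plain Python as follows. This is only a minimal sketch under some assumptions: the keypoint positions and descriptors have already been extracted into lists, the keypoint coordinates of both images are expressed in a comparable frame, and the names `nearest_neighbor_scores`, `image_dissimilarity` and `radius` are made up for this example (they are not part of any SIFT library). The score used here is a distance, so lower means more similar.

```python
import math
from statistics import mean


def nearest_neighbor_scores(kp1, desc1, kp2, desc2, radius):
    """For each descriptor of image 1, return the Euclidean distance to the
    most similar descriptor of image 2 whose keypoint lies within `radius`
    pixels, or None when image 2 has no keypoint that close.

    kp1, kp2: lists of (x, y) keypoint positions (assumed layout).
    desc1, desc2: lists of descriptor vectors, same order as the keypoints.
    """
    scores = []
    for (x1, y1), d1 in zip(kp1, desc1):
        best = None
        for (x2, y2), d2 in zip(kp2, desc2):
            # Skip keypoints outside the spatial vicinity.
            if math.hypot(x1 - x2, y1 - y2) > radius:
                continue
            dist = math.dist(d1, d2)  # Euclidean distance between descriptors
            if best is None or dist < best:
                best = dist
        scores.append(best)
    return scores


def image_dissimilarity(scores, statistic=mean):
    """Aggregate the per-descriptor scores with a statistic (mean by default),
    ignoring descriptors that had no neighbor in the vicinity."""
    matched = [s for s in scores if s is not None]
    return statistic(matched) if matched else None
```

Swapping `statistic` for `sum`, `statistics.median` or `statistics.pstdev`, and varying `radius`, is one way to run the experiment suggested above. The double loop is quadratic in the number of keypoints, so for large images a spatial index would be a more practical choice.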