The most important part is to make the files comparable.
A generic solution might be to scale all images to the same fixed size and convert them to greyscale, then save the results in a separate directory under the same file names for later reference. You could then sort that directory by file size and visually compare neighboring entries, since similar images tend to end up with similar file sizes.
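The normalization step could look roughly like the sketch below. It is only an illustration: it assumes the image has already been decoded into a grid of (R, G, B) tuples by whatever image library you use, and the names `to_grey` and `normalize` are made up for this example. Decoding and saving the files are left out.

```python
def to_grey(pixel):
    """Convert an (R, G, B) tuple to one 0-255 brightness value."""
    r, g, b = pixel
    # Integer form of the common 0.299 / 0.587 / 0.114 luma weights.
    return (299 * r + 587 * g + 114 * b) // 1000


def normalize(pixels, size=8):
    """Scale an RGB pixel grid to size x size greyscale.

    Hypothetical helper: uses plain nearest-neighbour sampling, which is
    crude but enough to make images of different dimensions comparable.
    """
    height = len(pixels)
    width = len(pixels[0])
    result = []
    for y in range(size):
        src_y = y * height // size  # source row for this output row
        row = []
        for x in range(size):
            src_x = x * width // size  # source column for this output column
            row.append(to_grey(pixels[src_y][src_x]))
        result.append(row)
    return result


# Usage: a 4x4 image, left half black, right half white, reduced to 2x2.
black, white = (0, 0, 0), (255, 255, 255)
image = [[black, black, white, white] for _ in range(4)]
small = normalize(image, size=2)
```

A real image library will usually give better results because it averages neighboring pixels when shrinking instead of picking single samples.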
The resulting pictures could also be quantified in some way (averaging blocks, lines, etc.) to detect similarities programmatically.
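As one possible reading of "averaging of blocks", the sketch below reduces a square greyscale grid to one bit per block and counts how many bits differ between two images. The function names and the choice of comparing each block against the overall mean are assumptions made for this example, not a fixed recipe. It also assumes the grid's side length is divisible by the number of blocks.

```python
def block_averages(grey, blocks=4):
    """Average a square greyscale grid into blocks x blocks cells."""
    step = len(grey) // blocks  # side length of one block, in pixels
    averages = []
    for by in range(blocks):
        for bx in range(blocks):
            total = 0
            for y in range(by * step, (by + 1) * step):
                for x in range(bx * step, (bx + 1) * step):
                    total += grey[y][x]
            averages.append(total // (step * step))
    return averages


def fingerprint(grey, blocks=4):
    """One bit per block: 1 if the block is brighter than the overall mean."""
    averages = block_averages(grey, blocks)
    mean = sum(averages) / len(averages)
    return [1 if a > mean else 0 for a in averages]


def distance(fp_a, fp_b):
    """Number of positions where two fingerprints differ."""
    return sum(a != b for a, b in zip(fp_a, fp_b))


# Usage: two nearly identical 4x4 images and one inverted image.
original = [[0, 0, 255, 255] for _ in range(4)]
faded = [[10, 10, 240, 240] for _ in range(4)]
inverted = [[255, 255, 0, 0] for _ in range(4)]

fp_original = fingerprint(original, blocks=2)
fp_faded = fingerprint(faded, blocks=2)
fp_inverted = fingerprint(inverted, blocks=2)
```

A small distance suggests the two images are probably similar. Which threshold counts as "small" depends on the collection and would need to be tuned by hand.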