This is still a research area, I believe. If you have some time in your hands, some relevant keywords are:
- Image copy detection
- Content based image retrieval
- Image indexing
- Image duplicate removal
Basically, each image is processed (indexed) to produce an "image signature". Similar images have similar signatures. If your images are just rescaled then probably their signature are nearly identical, so they cluster well. Some popular signatures are the MPEG-7 descriptors. To cluster, I think K-Means or any of its variants may be enough.
However, you probably need to deal with millions of images, this may be a problem.
Here is a link to the main Wikipedia entry:
http://en.wikipedia.org/wiki/CBIR
Hope this helps.