问题
At the moment I've got an database with over 100.000 images, they ain't the same size or anything like that but I want to make the following for my compagny:
I insert/upload an image and the system returns the image which is most likely the same. I don't know what algorithm to use but it needs to be fast. I can pre-process all the other images and put some info in the database which I use for the comparison.
Now what I want to know what the fastest way is to compare the images (with a good chance of being the same image). And what data I should save into the database (I could probably figure this one out myself if I got the algorithm).
It shouldn't take more then 5 minutes to compare the uploaded image to all the images in the database.
Thanks in advance!
Julian
Look at www.tineye.com, they have some kind of algorithm that I'm looking for. Guessing they use a very complex one, I just need one that does same thing but with lesser rate of succes.
回答1:
The way I would do it is I'd generate a really small (say.. 1/50 of the original image size) image from every image you're comparing against, and store the thumbnail image path along with the original size in the database. I'd keep the thumbnails as uncompressed bmp's for speed and loss-free-ness (I just made that word up!), since they're so small anyway.
To compare your new image against the other ones, shrink it down by the same amount and compare it against the others pixel by pixel, with a certain threshold (say.. 10% difference from the original).
If it passes this test, you can do a full blown pixel by pixel compare against the original image.
edit: I just want to mention that I went down the probabilistic way before too. It worked OK, but building the meta data for the images took forever, and there were a lot of false positives. Instinctively, I think that calculating local averages for each grid rectangle of your image (which is what shrinking your image down does) would give similar, if not better results.
回答2:
The best way for comparison is convert image to gray scale format and compare intensity of gray color. Its the fastest way used in real-time systems.
Also if you want to achieve higher qaullity and use colored images - use CIE 1994 or CIE 2000 as color difference formula
来源:https://stackoverflow.com/questions/4647793/c-sharp-image-comparison-fast-one