Metric for finding similar images in a database

喜欢而已 提交于 2019-12-03 00:31:58

There's one little confusing thing in your question: the "fingerprint" you linked to is explicitly not meant to find similar images (quote):

TinEye does not typically find similar images (i.e. a different image with the same subject matter); it finds exact matches including those that have been cropped, edited or resized.

Now, that said, I'm just going to assume you know what you are asking, and that you actually want to be able to find all similar images, not just edited exact copies.

If you want to try and get into it in detail, I would suggest looking up papers by Sivic, Zisserman and Nister, Stewenius. The idea these two papers (as well as quite a bit of others lately) have been using is to try and apply text-searching techniques to image databases, and search the image database in a same manner Google would search it's document (web-page) database.

The first paper I have linked to is a good starting point for this kind of approach, since it addresses mainly the big question: What are the "words" in the images?. Text searching techniques all focus on words, and base their similarity measures on calculations including word counts. Successful representation of images as collections of visual words is thus the first step to applying text-searching techniques to image databases.

The second paper then expands on the idea of using text-techniques, presenting a more suitable search structure. With this, they allow for a faster image retrieval and larger image databases. They also propose how to construct an image descriptor based on the underlying search structure.

The features used as visual words in both papers should satisfy your invariance constraints, and the second one definitely should be able to work with your required database size (maybe even the approach from the 1st paper would work).

Finally, I recommend looking up newer papers from the same authors (I'm positive Nister did something new, it's just that the approach from the linked paper has been enough for me until now), looking up some of their references and just generally searching for papers concerning Content based image (indexing and) retrieval (CBIR) - it is a very popular subject right now, so there should be plenty.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!