phash

Using SOLR to calculate “similarity”/“bitcount” between two ulongs

佐手、 submitted on 2019-11-30 10:00:42
We have a database of images for which I have calculated the pHash using Dr. Neal Krawetz's method, as implemented by David Oftedal. The part of the sample code that calculates the difference between these longs is here:

    ulong hash1 = AverageHash(theImage);
    ulong hash2 = AverageHash(theOtherImage);

    uint BitCount(ulong theNumber)
    {
        uint count = 0;
        // Consume the number one byte at a time, using the bitCounts lookup table.
        for (; theNumber > 0; theNumber >>= 8)
        {
            count += bitCounts[(theNumber & 0xFF)];
        }
        return count;
    }

    Console.WriteLine("Similarity: " + ((64 - BitCount(hash1 ^ hash2)) * 100.0) / 64.0 + "%");

The challenge is that I only know one of these hashes and I want to query
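For readers who want to check the arithmetic, here is a minimal Python sketch of the same similarity measure: XOR the two 64-bit hashes, count the set bits, and express the matching bits as a percentage of 64. The function names `bit_count` and `similarity_percent` are illustrative and not part of the original sample code, and the sketch does not address the querying problem described in the question.

```python
MASK_64 = 0xFFFFFFFFFFFFFFFF  # keep values within the range of a 64-bit ulong


def bit_count(number: int) -> int:
    """Count the bits set to 1 in a 64-bit unsigned value."""
    return bin(number & MASK_64).count("1")


def similarity_percent(hash1: int, hash2: int) -> float:
    """Percentage of the 64 bit positions in which the two hashes agree."""
    differing_bits = bit_count(hash1 ^ hash2)
    return (64 - differing_bits) * 100.0 / 64.0


if __name__ == "__main__":
    # Two hypothetical hashes that differ in exactly 8 bit positions.
    print("Similarity: %.2f%%" % similarity_percent(0x0, 0xFF))
```

The XOR leaves a 1 in every position where the hashes differ, so the bit count is the Hamming distance between them.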

How to build pHash on MacOSX Lion (using latest ffmpeg-devel)

我是研究僧i submitted on 2019-11-29 23:27:49
Question

Building pHash 0.9.4 on OSX can turn out to be tricky. For those of you who've run into issues, my somewhat lengthy answer below might help.

Answer 1:

Make sure you've got MacPorts fully updated and working. This means a recent Xcode, and inside Xcode Preferences -> Downloads -> Components, install Command-Line Tools!

    $ sudo port selfupdate
    # if you've had previous build issues:
    $ sudo port clean --all
    # get pHash
    wget http://www.phash.org/releases/pHash-0.9.4.tar.gz
    tar zxvf pHash-0.9.4.tar.gz
    cd

Similar image search by pHash distance in Elasticsearch

寵の児 submitted on 2019-11-28 03:15:59
Similar image search problem

Millions of images are pHash'ed and stored in Elasticsearch. The format is "11001101...11" (length 64), but it can be changed (preferably not). Given a subject image's hash "100111..10", we want to find all similar image hashes in the Elasticsearch index within a Hamming distance of 8. Of course, the query can return images with a distance greater than 8, and a script in Elasticsearch or outside it can filter the result set. But the total search time must be within 1 second or so.

Our current mapping

Each document has a nested images field that contains image hashes:

    { "images": { "type": "nested",
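The post-filtering step mentioned in the question (a script outside Elasticsearch that trims the result set to a Hamming distance of 8) could be sketched roughly as follows in Python. It assumes the hashes arrive as 64-character strings of "0" and "1", as described in the question. The names `hamming_distance` and `filter_within` are hypothetical, and the sketch says nothing about how the candidate set is retrieved from the index or whether the 1-second budget is met.

```python
from typing import Iterable, List


def hamming_distance(hash_a: str, hash_b: str) -> int:
    """Number of positions at which two equal-length bit strings differ."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    # Parse both strings as base-2 integers, XOR them, count the 1 bits.
    return bin(int(hash_a, 2) ^ int(hash_b, 2)).count("1")


def filter_within(subject: str, candidates: Iterable[str],
                  max_distance: int = 8) -> List[str]:
    """Keep only candidates within max_distance of the subject hash."""
    return [c for c in candidates
            if hamming_distance(subject, c) <= max_distance]


if __name__ == "__main__":
    # Hypothetical hashes: one at distance 8, one at distance 9.
    subject = "0" * 64
    near = "1" * 8 + "0" * 56
    far = "1" * 9 + "0" * 55
    print(filter_within(subject, [near, far]) == [near])
```

Converting the strings to integers keeps the comparison to a single XOR per candidate, which matters when the candidate set returned by the query is large.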

Similar image search by pHash distance in Elasticsearch

徘徊边缘 submitted on 2019-11-26 23:57:38
Question

Similar image search problem

Millions of images are pHash'ed and stored in Elasticsearch. The format is "11001101...11" (length 64), but it can be changed (preferably not). Given a subject image's hash "100111..10", we want to find all similar image hashes in the Elasticsearch index within a Hamming distance of 8. Of course, the query can return images with a distance greater than 8, and a script in Elasticsearch or outside it can filter the result set. But the total search time must be within 1 second or so.

Our current mapping