What you want to use is:
- Feature extraction
- Hashing
- Locality-sensitive hashing (LSH)
Most people use SIFT features, although I've had better results with non-scale-invariant ones. The basic idea: run an edge or corner detector to find interesting points, then center your image patches on those points. Because matching happens on local patches, this also lets you detect sub-images.
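A minimal sketch of that patch-extraction step, using raw gradient magnitude as a crude stand-in for a real corner/edge detector like Harris or SIFT's DoG (function names and parameters here are illustrative, not an existing API):

```python
import numpy as np

def interest_points(img, n_points=50, border=8):
    """Pick the highest-gradient pixels as crude interest points
    (a stand-in for a proper corner/edge detector)."""
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    # zero out a border so every patch fits inside the image
    mag[:border, :] = mag[-border:, :] = 0
    mag[:, :border] = mag[:, -border:] = 0
    idx = np.argsort(mag.ravel())[::-1][:n_points]
    # (row, col) coordinates of the strongest responses
    return np.column_stack(np.unravel_index(idx, mag.shape))

def extract_patches(img, points, size=16):
    """Center a size x size patch on each interest point; these
    patches are what you hash and compare."""
    half = size // 2
    return [img[r - half:r + half, c - half:c + half] for r, c in points]

# toy image: a bright square whose edges become the interest points
img = np.zeros((64, 64))
img[20:40, 20:40] = 1.0
pts = interest_points(img, n_points=10)
patches = extract_patches(img, pts, size=16)
```

A real detector would add non-maximum suppression so the points spread out instead of clustering on the single strongest edge, but the pipeline shape is the same.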
What you implemented is a hash method. There are plenty to choose from, but yours should work fine :)
The crucial step to making it fast is to hash your hashes. Convert your values into a unary representation, then take a random subset of the bits as the new hash. Do that with 20-50 random samples and you get 20-50 hash tables. If a feature matches in 2 or more of those 50 hash tables, it is very likely similar to one you already stored. This lets you convert the abs(x-y) distance comparison into plain hash-table lookups: two values that differ by d disagree on exactly d of their unary bits, so close features collide in most tables while distant ones almost never do.
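The hash-your-hashes step above can be sketched as bit-sampling LSH over a unary encoding. This is a hypothetical sketch; the `L1LSH` class and all its parameters are illustrative, not an existing library:

```python
import random
from collections import defaultdict

class L1LSH:
    """Bit-sampling LSH over a unary encoding: turns abs(x-y) (L1)
    nearest-neighbour search into plain hash-table lookups."""

    def __init__(self, dim, max_val, n_tables=30, bits_per_table=12, seed=0):
        rng = random.Random(seed)
        total_bits = dim * max_val  # length of the full unary bit string
        # each table samples its own random subset of bit positions
        self.samples = [rng.sample(range(total_bits), bits_per_table)
                        for _ in range(n_tables)]
        self.tables = [defaultdict(list) for _ in range(n_tables)]
        self.max_val = max_val

    def _unary_bit(self, vec, pos):
        # bit `pos` of the unary encoding: a value v contributes v ones
        i, offset = divmod(pos, self.max_val)
        return 1 if offset < vec[i] else 0

    def _keys(self, vec):
        for sample, table in zip(self.samples, self.tables):
            key = tuple(self._unary_bit(vec, p) for p in sample)
            yield key, table

    def add(self, vec, label):
        for key, table in self._keys(vec):
            table[key].append(label)

    def query(self, vec, min_matches=2):
        # count, per stored feature, how many tables it collides in
        hits = defaultdict(int)
        for key, table in self._keys(vec):
            for label in table[key]:
                hits[label] += 1
        # matching in >= min_matches tables means likely close in L1
        return [label for label, n in hits.items() if n >= min_matches]

index = L1LSH(dim=8, max_val=16, n_tables=30, bits_per_table=12)
index.add([3, 7, 1, 9, 0, 4, 12, 5], "feature_A")
index.add([15, 0, 14, 1, 16, 2, 0, 13], "feature_B")
# a query close in L1 to feature_A collides with it in most tables
result = index.query([3, 8, 1, 9, 0, 4, 12, 5])
```

The design choice behind the unary trick: two values differing by d disagree on exactly d unary bits, so a randomly sampled bit agrees with probability proportional to how close the values are, and the per-table collision probability decays with L1 distance.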
Hope it helps! If you'd like to try out my self-developed image similarity search, drop me a mail at hajo at spratpix.