Fast computation of pairs with least Hamming distance


Problem

Suppose you have N (~100k to 1M) integers/bitstrings, each K (e.g. 256) bits long. The algorithm should return the k pairs with the lowest pairwise Hamming distance ...

1 Answer

攒了一身酷

2020-12-18 11:08

The recent paper "The Closest Pair Problem under the Hamming Metric" gives only algorithms whose running time involves an n^2 factor (unless K is very large), and that is for finding just a single closest pair. So it seems hard to improve on pairwise comparison unless you make further assumptions about the structure of your instances. For example, if you assume that the Hamming distance of the pairs you are looking for is not very large, you could sample a few columns (bit positions), hash the strings into buckets according to their values in these columns, on the assumption that a close pair matches exactly there, and then do pairwise comparisons within each bucket separately. Repeat this with further random sets of columns to reduce the probability that you miss some pairs.
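A minimal Python sketch of the column-sampling idea, assuming the bitstrings are stored as Python integers. The function name `closest_pairs_sampled` and the parameters `sample_size` (columns sampled per round) and `rounds` (number of repetitions) are hypothetical names chosen for illustration, not taken from the paper or from any library. The result is heuristic: a close pair is only found if it agrees on every sampled column in at least one round.

```python
import heapq
import random
from collections import defaultdict
from itertools import combinations


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    # On Python 3.10+, (a ^ b).bit_count() does the same thing faster.
    return bin(a ^ b).count("1")


def closest_pairs_sampled(strings, num_bits, k, sample_size=16, rounds=20, seed=0):
    """Heuristically return the k closest pairs as (distance, i, j) tuples.

    Assumption (illustrative): the pairs of interest have a small Hamming
    distance, so they are likely to agree on a random subset of columns.
    """
    rng = random.Random(seed)
    seen = {}  # (i, j) with i < j -> exact Hamming distance
    for _ in range(rounds):
        # Sample a few columns and build a bit mask that selects them.
        columns = rng.sample(range(num_bits), sample_size)
        mask = sum(1 << c for c in columns)

        # Strings that agree on all sampled columns share a bucket.
        buckets = defaultdict(list)
        for idx, s in enumerate(strings):
            buckets[s & mask].append(idx)

        # Exact pairwise comparison, but only within each bucket.
        for members in buckets.values():
            for i, j in combinations(members, 2):
                if (i, j) not in seen:
                    seen[(i, j)] = hamming(strings[i], strings[j])

    # Keep the k candidate pairs with the smallest exact distance.
    return heapq.nsmallest(k, ((d, i, j) for (i, j), d in seen.items()))
```

As a rough guide to choosing the parameters: with m sampled columns out of K, a pair at distance d lands in the same bucket in one round with probability about (1 - d/K)^m, so it is missed in all r rounds with probability about (1 - (1 - d/K)^m)^r. Larger m gives smaller buckets (less pairwise work) but requires more rounds to keep the miss probability low. The `seen` dictionary stores every candidate pair, which is fine for a sketch but would need bounding (for example with a fixed-size heap) at N around 1M.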
