A space-efficient data structure to store and look up a large set of (uniformly distributed) integers

悲哀的现实 2021-01-06 22:28

I'm required to hold one million uniformly distributed integers in memory and look values up in them. My workload is extremely lookup-intensive.
My current implementation u

7 answers
  •  谎友^ (original poster)
    2021-01-06 22:55

    If you are willing to accept a small chance of a false positive in return for a large reduction in memory usage, then a Bloom filter may be just what you need.

    A Bloom filter consists of k hash functions and a table of n bits, initially empty. To add an item to the table, feed it to each of the k hash functions (getting a number between 0 and n−1) and set the corresponding bit. To check if an item is in the table, feed it to each of the k hash functions and see if all corresponding k bits are set.
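
    The add and check steps described above can be sketched in Java roughly as follows. This is only an illustration, not the linked implementation: the class name IntBloomFilter, the integer mixing function, and the trick of deriving all k indexes from two base hashes (double hashing) are assumptions made for this example.

```java
import java.util.BitSet;

/** Minimal Bloom filter for int keys; a sketch, not a tuned implementation. */
final class IntBloomFilter {
    private final BitSet bits; // the table of n bits, initially all clear
    private final int n;       // number of bits in the table
    private final int k;       // number of hash functions

    IntBloomFilter(int n, int k) {
        if (n <= 0 || k <= 0) {
            throw new IllegalArgumentException("n and k must be positive");
        }
        this.bits = new BitSet(n);
        this.n = n;
        this.k = k;
    }

    // Assumption: a generic 32-bit integer mixer stands in for "a hash function".
    private static int mix(int x) {
        x ^= x >>> 16;
        x *= 0x85ebca6b;
        x ^= x >>> 13;
        x *= 0xc2b2ae35;
        x ^= x >>> 16;
        return x;
    }

    // The i-th hash value, in the range 0..n-1, derived from two base hashes.
    private int index(int item, int i) {
        int h1 = mix(item);
        int h2 = mix(item ^ 0x9e3779b9) | 1; // forced odd so the step is never zero
        return Math.floorMod(h1 + i * h2, n);
    }

    /** Sets the k bits that correspond to the item. */
    void add(int item) {
        for (int i = 0; i < k; i++) {
            bits.set(index(item, i));
        }
    }

    /** Returns false if the item was never added; true may be a false positive. */
    boolean mightContain(int item) {
        for (int i = 0; i < k; i++) {
            if (!bits.get(index(item, i))) {
                return false; // at least one bit is clear: definitely absent
            }
        }
        return true;
    }
}
```

    A lookup touches at most k bits and never allocates, which suits a lookup-heavy workload. Note that a filter built this way never reports a false negative: every item that was added is always found.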

    A Bloom filter with a 1% false positive rate requires about 10 bits per item; the false positive rate decreases rapidly as you add more bits per item.
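
    To see where the "about 10 bits per item" figure comes from, here is a small sizing sketch. It assumes the standard textbook approximation for a Bloom filter with ideal hash functions and an optimally chosen k: bits per item = -ln(p) / (ln 2)^2, and k = (bits per item) * ln 2. The class and method names are made up for this example.

```java
/** Sizing helpers for a Bloom filter, using the standard approximation. */
final class BloomSizing {
    /** Bits needed per stored item for a target false positive rate p (0 < p < 1). */
    static double bitsPerItem(double p) {
        double ln2 = Math.log(2);
        return -Math.log(p) / (ln2 * ln2);
    }

    /** Number of hash functions that minimizes the false positive rate, rounded. */
    static int optimalHashCount(double p) {
        return (int) Math.round(bitsPerItem(p) * Math.log(2));
    }

    /** Total table size in bytes for the given item count and target rate. */
    static long totalBytes(long items, double p) {
        return (long) Math.ceil(items * bitsPerItem(p) / 8.0);
    }
}
```

    Under these assumptions, a 1% false positive rate works out to roughly 9.6 bits per item with 7 hash functions, so one million integers would need about 1.2 MB in total.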

    Here's an open-source implementation in Java.
