How many hash functions does my bloom filter need?

旧巷少年郎 2020-12-12 23:10

Wikipedia says:

An empty Bloom filter is a bit array of m bits, all set to 0. There must also be k different hash functions defined, each of which map

5 Answers
  •  再見小時候
    2020-12-12 23:23

    Given:

    • n: how many items you expect to have in your filter (e.g. 216,553)
    • p: your acceptable false positive rate, between 0 and 1 (e.g. 0.01 → 1%)

    we want to calculate:

    • m: the number of bits needed in the bloom filter
    • k: the number of hash functions we should apply

    The formulas:

    m = -n*ln(p) / (ln(2)^2)    (the number of bits)
    k = m/n * ln(2)             (the number of hash functions, rounded to a whole number)

    In our case:

    • m = -216553*ln(0.01) / (ln(2)^2) = 997263 / 0.48045 ≈ 2,075,686 bits (about 253 KiB)
    • k = m/n * ln(2) = 2075686/216553 * 0.693147 ≈ 6.64 hash functions (round up to 7 hash functions)
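The calculation above can be sketched in Python as follows. This is a minimal illustration, not part of the original answer: the function names `bloom_parameters` and `expected_false_positive_rate` are hypothetical, and rounding both m and k up is an assumption that matches the "7 hash functions" result. Because the code uses full-precision ln(2)^2 instead of the rounded 0.48045, the computed m comes out slightly smaller than the hand-calculated figure. The second function uses the standard approximation (1 - e^(-k*n/m))^k to check that the chosen m and k land close to the target rate.

```python
import math
from typing import Tuple


def bloom_parameters(n: int, p: float) -> Tuple[int, int]:
    """Return (m, k) for n expected items and target false positive rate p.

    m is the number of bits, k the number of hash functions.
    Both are rounded up (an assumption; the answer rounds k up to 7).
    """
    if n <= 0:
        raise ValueError("n must be a positive integer")
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")

    ln2 = math.log(2)
    # m = -n * ln(p) / (ln(2)^2)
    m = math.ceil(-n * math.log(p) / (ln2 ** 2))
    # k = m/n * ln(2), with at least one hash function
    k = max(1, math.ceil(m / n * ln2))
    return m, k


def expected_false_positive_rate(n: int, m: int, k: int) -> float:
    """Standard approximation of the false positive rate: (1 - e^(-k*n/m))^k."""
    return (1.0 - math.exp(-k * n / m)) ** k


if __name__ == "__main__":
    n, p = 216553, 0.01
    m, k = bloom_parameters(n, p)
    print("bits:", m, "bytes:", math.ceil(m / 8), "hash functions:", k)
    print("expected false positive rate:", expected_false_positive_rate(n, m, k))
```

With n = 216,553 and p = 0.01 this yields k = 7 and an m of roughly 2.08 million bits, in line with the worked example.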

    Note: Any code released into public domain. No attribution required.
