Wikipedia says:
An empty Bloom filter is a bit array of m bits, all set to 0. There must also be k different hash functions defined, each of which map
Given:
n: how many items you expect to have in your filter (e.g. 216,553)p: your acceptable false positive rate {0..1} (e.g. 0.01 → 1%)we want to calculate:
m: the number of bits needed in the bloom filterk: the number of hash functions we should applyThe formulas:
m = -n*ln(p) / (ln(2)^2) the number of bits
k = m/n * ln(2) the number of hash functions
In our case:
m = -216553*ln(0.01) / (ln(2)^2) = 997263 / 0.48045 = 2,075,686 bits (253 kB)k = m/n * ln(2) = 2075686/216553 * 0.693147 = 6.46 hash functions (7 hash functions)Note: Any code released into public domain. No attribution required.