How does the HyperLogLog algorithm work?

小蘑菇 2020-11-29 14:13

I've been learning about different algorithms in my spare time recently, and one that I came across that looks very interesting is the HyperLogLog algorithm.

3 answers
  •  清歌不尽
    2020-11-29 15:14

    The intuition is that if your input is a large set of random numbers (e.g. hashed values), they should be distributed evenly over a range. Say the range is 10 bits, so values go up to 1024. Now observe the minimum value; say it is 10. If the values are spread evenly, the gap between neighbours is about the same as the minimum, so the cardinality is estimated to be about 100 (10 × 100 ≈ 1024).
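
The minimum-value intuition above can be sketched in Python roughly as follows. This is only an illustration of that intuition, not HyperLogLog itself, and a single minimum gives a very noisy estimate. The function names (`hash_to_range`, `estimate_from_min`, `estimate_cardinality`), the choice of SHA-256 as the hash, and the default of 32 bits are all assumptions made for the example.

```python
import hashlib


def hash_to_range(item: str, bits: int = 32) -> int:
    """Map an item to a pseudo-random integer in 1..2**bits (hypothetical helper)."""
    digest = hashlib.sha256(item.encode("utf-8")).digest()
    # Take 8 bytes of the digest, reduce to the requested width, shift to start at 1.
    return int.from_bytes(digest[:8], "big") % (1 << bits) + 1


def estimate_from_min(min_value: int, range_size: int) -> float:
    """If values are spread evenly, the count is roughly range / smallest value."""
    return range_size / min_value


def estimate_cardinality(items, bits: int = 32) -> float:
    """Estimate the number of distinct items from the smallest hash seen."""
    smallest = min(hash_to_range(x, bits) for x in items)
    return estimate_from_min(smallest, 1 << bits)


# The numbers from the answer: range 1024, observed minimum 10.
print(estimate_from_min(10, 1024))  # 102.4, i.e. "about 100"
```

Because equal items hash to the same value, duplicates never change the minimum, which is why this kind of estimate counts distinct elements rather than total elements.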

    Read the paper for the real logic, of course.

    Another good explanation with sample code can be found here:
    Damn Cool Algorithms: Cardinality Estimation - Nick's Blog
