Fast algorithm for repeated calculation of percentile?

余生分开走 2020-12-12 17:21

In an algorithm I have to calculate the 75th percentile of a data set whenever I add a value. Right now I am doing this:

  1. Get value x
  2. Inse
6 answers
  •  野趣味 (OP)
    2020-12-12 18:11

    If an approximate answer is acceptable, you can use a histogram instead of keeping every value in memory.

    For each new value, add it to the appropriate bin. To compute the 75th percentile, traverse the bins in order and sum their counts until 75% of the population size is reached. The percentile value then lies between the low and high bounds of the bin at which you stopped.

    Each percentile query is O(B), where B is the number of bins, i.e. range_size/bin_size; adding a value is O(1). Choose a bin_size appropriate to your use case: smaller bins give a tighter estimate at the cost of more bins to traverse.

    I have implemented this logic in a JVM library: https://github.com/IBM/HBPE which you can use as a reference.
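
The histogram approach described in this answer can be sketched roughly as follows. This is a minimal illustration, not the API of the linked library: the class and method names (`HistogramPercentile`, `add`, `percentile_bounds`) are hypothetical, and it assumes the value range is known up front, that bins have equal width, and that out-of-range values may be clamped into the edge bins.

```python
import math


class HistogramPercentile:
    """Approximate running percentile using equal-width bins (illustrative sketch)."""

    def __init__(self, low, high, bin_size):
        # Assumption: the range [low, high) of incoming values is known in advance.
        if high <= low or bin_size <= 0:
            raise ValueError("need high > low and bin_size > 0")
        self.low = low
        self.bin_size = bin_size
        # Number of bins B = range_size / bin_size, rounded up.
        self.counts = [0] * math.ceil((high - low) / bin_size)
        self.total = 0

    def add(self, x):
        """O(1): find the bin by arithmetic and increment its count."""
        i = int((x - self.low) // self.bin_size)
        # Assumption: values outside the range are clamped into the edge bins.
        i = min(max(i, 0), len(self.counts) - 1)
        self.counts[i] += 1
        self.total += 1

    def percentile_bounds(self, p=75.0):
        """O(B): return (low, high) bounds of the bin holding the p-th percentile."""
        if self.total == 0:
            raise ValueError("no data added yet")
        if not 0 < p <= 100:
            raise ValueError("p must be in (0, 100]")
        target = self.total * p / 100.0
        running = 0
        for i, count in enumerate(self.counts):
            running += count
            if running >= target:
                bin_low = self.low + i * self.bin_size
                return bin_low, bin_low + self.bin_size
        # Not reached for valid p: the running sum ends at self.total >= target.
        raise RuntimeError("percentile bin not found")


# Usage: the values 0..99 fall ten per bin, so the 75th percentile
# is reported as lying somewhere in the bin covering 70 to 80.
hist = HistogramPercentile(low=0, high=100, bin_size=10)
for value in range(100):
    hist.add(value)
print(hist.percentile_bounds(75))
```

The result is a pair of bounds rather than a single number, matching the answer's point that the percentile is only known to lie within the bin where the traversal stopped. If a single value is needed, one option is to interpolate inside that bin, which adds a further assumption that values are spread evenly within it.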
