Map each list value to its corresponding percentile

夕颜 2020-11-29 00:11

I'd like to create a function that takes a (sorted) list as its argument and outputs a list containing each element's corresponding percentile.

For example,

9 Answers
  •  北荒 (original poster)
    2020-11-29 00:19

    For a pure-Python function that calculates the percentile score of a given item relative to a population distribution (a list of scores), I took this from the SciPy source code and removed all references to NumPy:

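    def percentileofscore(a, score, kind='rank'):
        """Return the percentile of score relative to the values in a."""
        # Uses true division; on Python 2 add `from __future__ import division`.
        n = len(a)
        if n == 0:
            return 100.0
        # left: values strictly below the score; right: values at or below it.
        left = sum(1 for item in a if item < score)
        right = sum(1 for item in a if item <= score)
        if kind == 'rank':
            # Average percentage ranking of all values equal to the score.
            return (right + left + (1 if right > left else 0)) * 50.0 / n
        elif kind == 'strict':
            return left / n * 100
        elif kind == 'weak':
            return right / n * 100
        elif kind == 'mean':
            # Average of the 'strict' and 'weak' results.
            return (left + right) / n * 50
        else:
            raise ValueError("kind can only be 'rank', 'strict', 'weak' or 'mean'")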
    

    source: https://github.com/scipy/scipy/blob/v1.2.1/scipy/stats/stats.py#L1744-L1835

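    Calculating percentiles is trickier than one would think, but this function is far lighter than pulling in the full scipy/numpy/scikit stack, which makes it a good fit for lightweight deployment. The original code counts the matching values with np.count_nonzero instead of list comprehensions, but otherwise the math is the same. The optional kind parameter controls how values equal to the score are counted: 'strict' counts only values below the score, 'weak' counts values at or below it, 'mean' averages those two results, and 'rank' (the default) averages the percentage rankings of all values that match the score.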
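    To make the effect of the kind parameter concrete, here is a small sketch that computes all four variants side by side. The helper name all_kinds and the sample data are made up for illustration, and it assumes Python 3 division and a non-empty list:

```python
def all_kinds(a, score):
    """Return the percentile of `score` in `a` under each of the four kinds.

    Hypothetical helper for illustration only; assumes `a` is non-empty.
    """
    n = len(a)
    left = sum(1 for item in a if item < score)    # values strictly below the score
    right = sum(1 for item in a if item <= score)  # values at or below the score
    return {
        "strict": left / n * 100,
        "weak": right / n * 100,
        "mean": (left + right) / n * 50,
        "rank": (right + left + (1 if right > left else 0)) * 50.0 / n,
    }


# Made-up sample: the score 3 appears twice, so the kinds disagree.
result = all_kinds([1, 2, 3, 3, 4], 3)
for kind, pct in result.items():
    print(kind, pct)
```

    With two of the five values below 3 and four at or below it, 'strict' comes out at about 40, 'weak' at about 80, 'mean' at about 60 and 'rank' at about 70. The kinds only differ when the score ties with values in the list or is itself in the list.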

    For this use case, one can call this function for each item in a list using the map() function.
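    A minimal sketch of that map() call might look like the following. The function is repeated in a condensed, 'rank'-only form so the snippet runs on its own, and the scores list is made-up sample data:

```python
def percentileofscore(a, score):
    """Condensed 'rank'-only copy of the function above, for illustration."""
    n = len(a)
    if n == 0:
        return 100.0
    left = sum(1 for item in a if item < score)    # values strictly below
    right = sum(1 for item in a if item <= score)  # values at or below
    return (right + left + (1 if right > left else 0)) * 50.0 / n


scores = [10, 20, 30, 30, 40]  # made-up sample data, already sorted

# map() is lazy in Python 3, so wrap it in list() to materialize the results.
percentiles = list(map(lambda s: percentileofscore(scores, s), scores))
print(percentiles)  # [20.0, 40.0, 70.0, 70.0, 100.0]
```

    Each call scans the whole list, so this takes time quadratic in the list length. That should be fine for small lists, but it is worth keeping in mind for large ones.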
