I\'d like to create a function that takes a (sorted) list as its argument and outputs a list containing each element\'s corresponding percentile.
For example,
for a pure python function to calculate a percentile score for a given item, compared to the population distribution (a list of scores), I pulled this from the scipy source code and removed all references to numpy:
def percentileofscore(a, score, kind='rank'):
n = len(a)
if n == 0:
return 100.0
left = len([item for item in a if item < score])
right = len([item for item in a if item <= score])
if kind == 'rank':
pct = (right + left + (1 if right > left else 0)) * 50.0/n
return pct
elif kind == 'strict':
return left / n * 100
elif kind == 'weak':
return right / n * 100
elif kind == 'mean':
pct = (left + right) / n * 50
return pct
else:
raise ValueError("kind can only be 'rank', 'strict', 'weak' or 'mean'")
source: https://github.com/scipy/scipy/blob/v1.2.1/scipy/stats/stats.py#L1744-L1835
Given that calculating percentiles is trickier than one would think, but way less complicated than the full scipy/numpy/scikit package, this is the best for light-weight deployment. The original code filters for only nonzero-values better, but otherwise, the math is the same. The optional parameter controls how it handles values that are in between two other values.
For this use case, one can call this function for each item in a list using the map() function.