percentile

Percentiles of Live Data Capture

不想你离开。 submitted on 2019-11-27 17:15:17
I am looking for an algorithm that determines percentiles for live data capture. For example, consider the development of a server application. The server might have response times as follows: 17 ms, 33 ms, 52 ms, 60 ms, 55 ms, etc. It is useful to report the 90th percentile response time, the 80th percentile response time, etc. The naive algorithm is to insert each response time into a list. When statistics are requested, sort the list and get the values at the proper positions. Memory usage scales linearly with the number of requests. Is there an algorithm that yields "approximate" percentile
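One way to bound memory, sketched here as an illustration and not taken from the thread's answers, is to count samples in fixed-width buckets and walk the cumulative counts on request. The class name `HistogramPercentiles` and the parameters `max_value_ms` and `bucket_width_ms` are hypothetical, and the result is only accurate to within one bucket width (values above `max_value_ms` all land in a single overflow bucket).

```python
import math


class HistogramPercentiles:
    """Approximate percentiles in fixed memory using equal-width buckets."""

    def __init__(self, max_value_ms=1000.0, bucket_width_ms=1.0):
        self.width = bucket_width_ms
        # One extra overflow bucket collects everything >= max_value_ms.
        self.counts = [0] * (math.ceil(max_value_ms / bucket_width_ms) + 1)
        self.total = 0

    def record(self, value_ms):
        """Count one sample; memory use does not grow with the sample count."""
        index = min(int(value_ms // self.width), len(self.counts) - 1)
        self.counts[max(index, 0)] += 1
        self.total += 1

    def percentile(self, pct):
        """Return the upper edge of the bucket holding the pct-th percentile."""
        if self.total == 0:
            raise ValueError("no samples recorded")
        # Rank of the sample we are looking for (1-based, nearest-rank rule).
        target = max(math.ceil(self.total * pct / 100.0), 1)
        seen = 0
        for index, count in enumerate(self.counts):
            seen += count
            if seen >= target:
                return (index + 1) * self.width
        return len(self.counts) * self.width


hist = HistogramPercentiles()
for response_ms in (17, 33, 52, 60, 55):
    hist.record(response_ms)
p90 = hist.percentile(90)
```

The trade-off is that the value range and resolution must be chosen up front; a narrower bucket width gives tighter estimates at the cost of more counters.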

Calculating percentile rank in MySQL

北战南征 submitted on 2019-11-27 13:14:22
I have a very big table of measurement data in MySQL and I need to compute the percentile rank for each and every one of these values. Oracle appears to have a function called percent_rank but I can't find anything similar for MySQL. Sure, I could just brute-force it in Python, which I use anyway to populate the table, but I suspect that would be quite inefficient because one sample might have 200.000 observations. This is a relatively ugly answer, and I feel guilty saying it. That said, it might help you with your issue. One way to determine the percentage would be to count all of the rows, and
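For comparison, the Python route the asker mentions does not have to be a quadratic brute force. The sketch below is an illustration, not the answer's SQL: it sorts once and uses binary search, following the usual percent-rank convention of (rank - 1) / (n - 1) with tied values sharing the lowest rank. The function name `percent_rank` is chosen here for readability. (MySQL 8.0 and later also provide a `PERCENT_RANK()` window function, which postdates the question.)

```python
from bisect import bisect_left


def percent_rank(values):
    """Return the percent rank of every value, in the original order.

    The rank of a value is 1 + the number of strictly smaller values, and
    the percent rank is (rank - 1) / (n - 1), so ties share one result.
    """
    n = len(values)
    if n == 0:
        return []
    if n == 1:
        return [0.0]
    ordered = sorted(values)  # O(n log n), done once
    # bisect_left counts how many entries are strictly smaller than v.
    return [bisect_left(ordered, v) / (n - 1) for v in values]


ranks = percent_rank([3, 1, 2])
```

With one sort and one binary search per value, a few hundred thousand observations should be well within reach of plain Python.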

Map each list value to its corresponding percentile

北战南征 submitted on 2019-11-27 00:23:21
I'd like to create a function that takes a (sorted) list as its argument and outputs a list containing each element's corresponding percentile. For example, fn([1,2,3,4,17]) returns [0.0, 0.25, 0.50, 0.75, 1.00]. Can anyone please either help me correct my code below, or offer a better alternative than my code for mapping values in a list to their corresponding percentiles? My current code:

    def median(mylist):
        length = len(mylist)
        if not length % 2:
            return (mylist[length / 2] + mylist[length / 2 - 1]) / 2.0
        return mylist[length / 2]
    ###########################################################
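A minimal sketch of one possible alternative, assuming the desired output is purely positional (each element's index divided by the last index), which is what the example in the question implies. The function name `percentiles_of_sorted` is hypothetical. Separately, note that under Python 3 the expression `mylist[length / 2]` in the quoted `median` would raise a `TypeError`, because `/` produces a float; `//` would be needed there.

```python
def percentiles_of_sorted(sorted_values):
    """Map each position in an already sorted list to a value in [0.0, 1.0].

    The first element maps to 0.0 and the last to 1.0. Equal values at
    different positions get different results in this positional scheme.
    """
    n = len(sorted_values)
    if n == 0:
        return []
    if n == 1:
        return [0.0]
    return [i / (n - 1) for i in range(n)]


result = percentiles_of_sorted([1, 2, 3, 4, 17])
```

If tied values should share a percentile, a rank-based definition would be needed in place of the positional one.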

matplotlib: disregard outliers when plotting

坚强是说给别人听的谎言 submitted on 2019-11-27 00:22:11
Question: I'm plotting some data from various tests. Sometimes in a test I happen to have one outlier (say 0.1), while all other values are three orders of magnitude smaller. With matplotlib, I plot against the range [0, max_data_value]. How can I just zoom into my data and not display outliers, which would mess up the x-axis in my plot? Should I simply take the 95th percentile and have the range [0, 95_percentile] on the x-axis? Answer 1: There's no single "best" test for an outlier. Ideally, you should
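The asker's own idea of using [0, 95_percentile] as the axis range can be sketched with the standard library alone. This illustrates that idea only, not the approach the truncated answer goes on to recommend. The helper name `upper_axis_limit` is hypothetical; it relies on `statistics.quantiles` (Python 3.8+), which needs at least two data points.

```python
import statistics


def upper_axis_limit(data, pct=95):
    """Return the pct-th percentile of data, for use as an upper axis bound."""
    if not 1 <= pct <= 99:
        raise ValueError("pct must be an integer from 1 to 99")
    # quantiles(n=100) returns 99 cut points; index pct - 1 is the pct-th
    # percentile. "inclusive" interpolates linearly between data points.
    cut_points = statistics.quantiles(data, n=100, method="inclusive")
    return cut_points[pct - 1]


samples = [0.0001] * 99 + [0.1]  # one outlier, three orders of magnitude larger
limit = upper_axis_limit(samples)
# With matplotlib, the limit could then be applied as: ax.set_xlim(0, limit)
```

Clipping the axis this way hides the outlier from view without removing it from the data.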

Calculating percentile of dataset column

…衆ロ難τιáo~ submitted on 2019-11-26 19:49:51
Question: A quick one for you, dearest R gurus: I'm doing an assignment and I've been asked, in this exercise, to get basic statistics out of the infert dataset (it's built in), and specifically one of its columns, infert$age. For anyone not familiar with the dataset:

    > table_ages  # Which is just subset(infert, select=c("age"));
        age
    1    26
    2    42
    3    39
    4    34
    5    35
    6    36
    7    23
    8    32
    9    21
    10   28
    11   29
    ...
    246  35
    247  29
    248  23

I've had to find median values of the column, variance, skewness, standard deviation

Weighted percentile using numpy

a 夏天 submitted on 2019-11-26 19:45:14
Question: Is there a way to use the numpy.percentile function to compute a weighted percentile? Or is anyone aware of an alternative Python function to compute a weighted percentile? Thanks! Answer 1: Unfortunately, numpy doesn't have built-in weighted functions for everything, but you can always put something together.

    import numpy as np

    def weight_array(ar, weights):
        # Repeat each value weights[k] times (requires non-negative integer weights).
        zipped = zip(ar, weights)
        weighted = []
        for i in zipped:
            for j in range(i[1]):
                weighted.append(i[0])
        return weighted

    np.percentile(weight_array(ar, weights), 25)
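The repetition approach above needs integer weights and builds a list as long as the sum of the weights. A different technique, sketched here in plain Python as an assumption-laden alternative, walks the cumulative weights of the sorted values instead. It uses a nearest-rank rule (the smallest value whose cumulative weight reaches the requested share), so it does not interpolate and can differ from what `np.percentile` returns on the expanded list. The function name `weighted_percentile` is hypothetical, and weights are assumed to be non-negative.

```python
def weighted_percentile(values, weights, pct):
    """Return the smallest value whose cumulative weight covers pct percent.

    Weights may be non-integer but are assumed to be non-negative.
    """
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if not values:
        raise ValueError("no data")
    pairs = sorted(zip(values, weights))  # sort by value, keep weights attached
    total = sum(weight for _, weight in pairs)
    if total <= 0:
        raise ValueError("total weight must be positive")
    threshold = total * pct / 100.0
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= threshold:
            return value
    return pairs[-1][0]  # guards against floating-point shortfall at pct=100


median_like = weighted_percentile([1, 2, 3], [1, 1, 2], 50)
```

Because nothing is materialized, memory stays proportional to the number of distinct observations, regardless of how large the weights are.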

How do I calculate percentiles with python/numpy?

可紊 submitted on 2019-11-26 19:21:17
Is there a convenient way to calculate percentiles for a sequence or single-dimensional numpy array? I am looking for something similar to Excel's percentile function. I looked in NumPy's statistics reference, and couldn't find this. All I could find is the median (50th percentile), but not something more specific. Jon W You might be interested in the SciPy Stats package. It has the percentile function you're after and many other statistical goodies. percentile() is available in numpy too.

    import numpy as np
    a = np.array([1,2,3,4,5])
    p = np.percentile(a, 50) # return 50th percentile, e.g

Percentiles of Live Data Capture

你离开我真会死。 submitted on 2019-11-26 18:55:13
Question: I am looking for an algorithm that determines percentiles for live data capture. For example, consider the development of a server application. The server might have response times as follows: 17 ms, 33 ms, 52 ms, 60 ms, 55 ms, etc. It is useful to report the 90th percentile response time, the 80th percentile response time, etc. The naive algorithm is to insert each response time into a list. When statistics are requested, sort the list and get the values at the proper positions. Memory usage scales

Calculating percentile rank in MySQL

别来无恙 submitted on 2019-11-26 13:58:55
Question: I have a very big table of measurement data in MySQL and I need to compute the percentile rank for each and every one of these values. Oracle appears to have a function called percent_rank but I can't find anything similar for MySQL. Sure, I could just brute-force it in Python, which I use anyway to populate the table, but I suspect that would be quite inefficient because one sample might have 200.000 observations. Answer 1: This is a relatively ugly answer, and I feel guilty saying it. That said,

Map each list value to its corresponding percentile

混江龙づ霸主 submitted on 2019-11-26 12:22:02
Question: I'd like to create a function that takes a (sorted) list as its argument and outputs a list containing each element's corresponding percentile. For example, fn([1,2,3,4,17]) returns [0.0, 0.25, 0.50, 0.75, 1.00]. Can anyone please either help me correct my code below, or offer a better alternative than my code for mapping values in a list to their corresponding percentiles? My current code:

    def median(mylist):
        length = len(mylist)
        if not length % 2:
            return (mylist[length / 2] + mylist