percentile

Adding Different Percentiles in boxplots in R

走远了吗. Posted on 2019-12-04 12:57:32
Question: I am fairly new to R and recently used it to make some boxplots. I also added the mean and standard deviation to my boxplot. I was wondering if I could add some kind of tick mark or circle at different percentiles as well. Say I want to mark the 85th and 90th percentiles in each HOUR boxplot: is there a way to do this? My data consists of a year's worth of loads in MW for each hour, and my output consists of 24 boxplots, one per hour, for each month. I am doing one month at a time because I am

Conditional array to calculate percentiles

五迷三道 Posted on 2019-12-04 11:57:01
I have some data as follows:

val          crit  perc
0.415605498  1     perc1
0.475426007  1     perc1
0.418621318  1     perc1
0.51608229   1     perc1
0.452307882  1     perc1
0.496691416  1     perc1
0.402689126  1     perc1
0.494381345  1     perc1
0.532406777  1     perc1
0.839352016  2     perc2
0.618221702  2     perc2
0.83947033   2     perc2
0.621734007  2     perc2
0.548656662  2     perc2
0.711919796  2     perc2
0.758178085  2     perc2
0.820954467  2     perc2
0.478645786  2     perc2
0.848323655  2     perc2
0.844986383  2     perc2
0.418155292  2     perc2
1.182637063  3     perc3
1.248876472  3     perc3
1.218368809  3     perc3
0.664934398  3     perc3
0.951692853  3     perc3
0.848111264  3     perc3
0.58887439   3     perc3

SQL rank percentile

有些话、适合烂在心里 Posted on 2019-12-04 06:54:19
Question: I've made an SQL query which ranks pages by how many times they have been viewed. For instance:

PAGE  VIEWS
J     100
Q     77
3     55
A     23
2     6

Now what I would like to do is find the percentile rank of each page using an SQL query. The math I would like to use for this is simple enough: I just want to take the row number of the already generated table divided by the total number of rows. Or 1 minus this value,
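The arithmetic described in this excerpt (row number divided by total rows, or 1 minus that value) can be sketched outside SQL. The following is a minimal Python illustration, not the asker's query; the function name `percentile_ranks` is hypothetical, and tied view counts are not given any special treatment.

```python
def percentile_ranks(pages):
    """Return (page, views, row/total, 1 - row/total), most viewed first."""
    # Order the pages the same way the ranked table is ordered.
    ranked = sorted(pages, key=lambda item: item[1], reverse=True)
    total = len(ranked)
    result = []
    for row, (page, views) in enumerate(ranked, start=1):
        fraction = row / total  # row number divided by total rows
        result.append((page, views, fraction, 1 - fraction))
    return result


# Sample rows taken from the table in the question.
pages = [("J", 100), ("Q", 77), ("3", 55), ("A", 23), ("2", 6)]
ranks = percentile_ranks(pages)
```

With five rows, the most viewed page gets 1/5 under the first formula, and the least viewed page gets 0 under the "1 minus" variant.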

get top and bottom 25th percentile average

Anonymous (unverified) Posted on 2019-12-03 10:24:21
Question: I have a table with a list of employees and the number of units that they have sold. I want to get the average units sold for the top 25th percentile and for the bottom 25th percentile. I have created a representation of my data on SQL Fiddle. I really have no idea how to start on this. All the examples I see are for SQL Server and not MySQL. Here is what I am thinking. I want the 25th percentile and can't simply limit to 25 items. Basically it would involve:
1) #_of_employees = the total number of employees.
2) #_of_employees_in_25_percentile = #_of
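The idea of counting the employees first and then averaging a quarter of them can be sketched as follows. This is a Python illustration rather than MySQL, with made-up unit counts; the function name `quartile_averages` is hypothetical, and rounding the group size up with `ceil` is an assumption, since the excerpt is cut off before it says how the size is derived.

```python
import math


def quartile_averages(units, fraction=0.25):
    """Return (average of the top slice, average of the bottom slice)."""
    if not units:
        raise ValueError("units must not be empty")
    ordered = sorted(units)
    # Assumption: round the group size up and keep at least one employee.
    k = max(1, math.ceil(len(ordered) * fraction))
    bottom = ordered[:k]
    top = ordered[-k:]
    return sum(top) / k, sum(bottom) / k


# Hypothetical units sold by eight employees.
units_sold = [10, 20, 30, 40, 50, 60, 70, 80]
top_avg, bottom_avg = quartile_averages(units_sold)
```

With eight employees the slice size is 2, so the top average covers 70 and 80 and the bottom average covers 10 and 20.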

Calculating average and percentiles from a histogram map?

Anonymous (unverified) Posted on 2019-12-03 08:35:02
Question: I have written a timer which measures the performance of a particular piece of code in any multithreaded application. The timer below also populates a map with how many calls took x milliseconds. I will use this map as my histogram for further analysis, such as what percentage of calls took a given number of milliseconds.

public static class StopWatch {
    public static ConcurrentHashMap<Long, Long> histogram = new ConcurrentHashMap<Long, Long>();
    /**
     * Creates an instance of the timer and starts it running.
     */
    public static
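Given a map from latency in milliseconds to call count, as described in this excerpt, the average and a percentile can be read off without expanding the histogram. The sketch below is in Python rather than Java; the function names are hypothetical, and the percentile uses the nearest-rank definition, which is one of several common conventions and an assumption here.

```python
import math


def histogram_mean(hist):
    """Weighted mean of the keys, using the counts as weights."""
    total = sum(hist.values())
    return sum(ms * count for ms, count in hist.items()) / total


def histogram_percentile(hist, pct):
    """Nearest-rank percentile: smallest key covering pct percent of calls."""
    total = sum(hist.values())
    target = max(1, math.ceil(pct * total / 100))
    seen = 0
    for ms in sorted(hist):
        seen += hist[ms]
        if seen >= target:
            return ms
    raise ValueError("pct must be between 0 and 100")


# Hypothetical histogram: 5 calls took 1 ms, 3 took 2 ms, 2 took 10 ms.
histogram = {1: 5, 2: 3, 10: 2}
mean_ms = histogram_mean(histogram)
p50 = histogram_percentile(histogram, 50)
p80 = histogram_percentile(histogram, 80)
p90 = histogram_percentile(histogram, 90)
```

Walking the keys in sorted order keeps the work proportional to the number of distinct latencies rather than the number of calls.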

Adding Different Percentiles in boxplots in R

元气小坏坏 Posted on 2019-12-03 08:19:16
I am fairly new to R and recently used it to make some boxplots. I also added the mean and standard deviation to my boxplot. I was wondering if I could add some kind of tick mark or circle at different percentiles as well. Say I want to mark the 85th and 90th percentiles in each HOUR boxplot: is there a way to do this? My data consists of a year's worth of loads in MW for each hour, and my output consists of 24 boxplots, one per hour, for each month. I am doing one month at a time because I am not sure if there is a way to run all 96 (each month, weekday/weekend, for 4 different zones) boxplots
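Before any marker can be drawn, the 85th and 90th percentiles have to be computed per hour. The sketch below shows only that computation, in Python rather than R and on made-up loads; the function name `hourly_percentiles` is hypothetical, and the "inclusive" interpolation method is an assumption about which percentile definition is wanted.

```python
from collections import defaultdict
from statistics import quantiles


def hourly_percentiles(readings, wanted=(85, 90)):
    """Map each hour to {percentile: value} for the requested percentiles."""
    by_hour = defaultdict(list)
    for hour, load in readings:
        by_hour[hour].append(load)
    result = {}
    for hour, loads in sorted(by_hour.items()):
        # n=100 yields 99 cut points; cut point p - 1 is the p-th percentile.
        cuts = quantiles(loads, n=100, method="inclusive")
        result[hour] = {p: cuts[p - 1] for p in wanted}
    return result


# Hypothetical loads in MW for two hours of the day.
readings = [(0, float(mw)) for mw in range(101)]
readings += [(1, float(mw)) for mw in range(100, 301, 2)]
marks = hourly_percentiles(readings)
```

Each hour's pair of values would then be overlaid on the corresponding box as points or tick marks.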

PostgreSQL equivalent of Oracle's PERCENTILE_CONT function

走远了吗. Posted on 2019-12-03 07:40:18
Question: Has anyone found a PostgreSQL equivalent of Oracle's PERCENTILE_CONT function? I searched and could not find one, so I wrote my own. Here is the solution, which I hope helps you out. The company I work for wanted to migrate a Java EE web application from an Oracle database to PostgreSQL. Several stored procedures relied heavily on Oracle's unique PERCENTILE_CONT() function. This function does not exist in PostgreSQL. I tried searching to see if anyone had "ported over"
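For reference, the interpolation that PERCENTILE_CONT performs can be sketched in a few lines. This is a Python illustration of the documented definition (row position 1 + p(N - 1), linearly interpolated between the neighbouring rows), not the poster's port; the function name is hypothetical. Note also that PostgreSQL 9.4 and later ship a built-in ordered-set aggregate, `percentile_cont(fraction) WITHIN GROUP (ORDER BY ...)`, so a hand-written port mainly matters for older versions.

```python
import math


def percentile_cont(values, fraction):
    """Continuous percentile with linear interpolation between sorted rows."""
    if not 0 <= fraction <= 1:
        raise ValueError("fraction must be between 0 and 1")
    ordered = sorted(values)
    if not ordered:
        return None  # no rows, no result
    # Zero-based equivalent of the row position 1 + fraction * (N - 1).
    position = fraction * (len(ordered) - 1)
    lower = math.floor(position)
    upper = math.ceil(position)
    if lower == upper:
        return ordered[lower]
    # Weight each neighbour by its distance from the target position.
    return ordered[lower] * (upper - position) + ordered[upper] * (position - lower)


median_even = percentile_cont([1, 2, 3, 4], 0.5)
median_odd = percentile_cont([10, 20, 30], 0.5)
first_quartile = percentile_cont([1, 2, 3, 4, 5], 0.25)
```

For an even number of rows the median falls halfway between the two middle values, which is the case that distinguishes the continuous percentile from the discrete one.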

Percentile for Each Observation w/r/t Grouping Variable

Anonymous (unverified) Posted on 2019-12-03 02:50:02
Question: I have some data that looks like the following. It is grouped by the variable "Year", and I want to extract the percentile of each observation of Score with respect to the Year it is from, preferably as a vector.

Year  Score
2001  89
2001  70
2001  72
2001  ...
..........
2004  87
2004  90
etc.

How can I do this? aggregate will not work, and I do not think apply will work either.

Answer 1: Following up on Vince's solution, you can also do this with plyr or by: ddply(df, .(years), function(x) transform(x, percentile=ecdf(x$scores)(x$scores)))

Answer 2: Using
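The `ecdf(x)(x)` idiom in Answer 1 gives, for each score, the share of scores in the same year that are less than or equal to it. A minimal Python sketch of the same computation follows; the function name is hypothetical and only the rows visible in the excerpt are used as sample data.

```python
from bisect import bisect_right
from collections import defaultdict


def percentiles_by_group(rows):
    """For each (year, score), return the within-year empirical CDF value."""
    groups = defaultdict(list)
    for year, score in rows:
        groups[year].append(score)
    sorted_groups = {year: sorted(scores) for year, scores in groups.items()}
    # bisect_right counts how many scores in the group are <= this score.
    return [
        bisect_right(sorted_groups[year], score) / len(sorted_groups[year])
        for year, score in rows
    ]


# Only the rows visible in the excerpt; the elided rows are left out.
rows = [(2001, 89), (2001, 70), (2001, 72), (2004, 87), (2004, 90)]
pct = percentiles_by_group(rows)
```

The result is a list in the original row order, which matches the request for the percentiles "as a vector".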

How to find exact median for grouped data in Spark

Anonymous (unverified) Posted on 2019-12-03 02:38:01
Question: I have a requirement to calculate the exact median on a grouped data set of Double datatype in Spark using Scala. It is different from the similar query "Find median in spark SQL for multiple double datatype columns". This question is about finding the median for grouped data, whereas the other one is about finding the median at the RDD level. Here is my sample data:

scala> sqlContext.sql("select * from test").show()
+---+---+
| id|num|
+---+---+
|  A|0.0|
|  A|1.0|
|  A|1.0|
|  A|1.0|
|  A|0.0|
|  A|1.0|
|  B|0.0|
|  B|1.0|
|  B|1.0|
+---+---+

Expected Answer: +--
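To make "exact median per group" concrete, here is a small sketch of the computation on the sample rows from the question. It is plain Python rather than Spark or Scala, so it only illustrates the definition (sort each group and take the middle value, or the mean of the two middle values), not how to distribute the work; the function name is hypothetical.

```python
from collections import defaultdict
from statistics import median


def grouped_median(rows):
    """Return {id: exact median of num} for (id, num) rows."""
    groups = defaultdict(list)
    for key, num in rows:
        groups[key].append(num)
    # statistics.median sorts the values and averages the two middle ones
    # when the group has an even number of rows.
    return {key: median(values) for key, values in groups.items()}


# Sample rows copied from the table in the question.
rows = [
    ("A", 0.0), ("A", 1.0), ("A", 1.0), ("A", 1.0), ("A", 0.0), ("A", 1.0),
    ("B", 0.0), ("B", 1.0), ("B", 1.0),
]
medians = grouped_median(rows)
```

An exact median needs every value of a group in sorted order, which is what makes it more expensive than an approximate percentile on large data sets.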

How to make user defined functions for binned_statistic

Anonymous (unverified) Posted on 2019-12-03 01:18:02
Question: I am using the scipy stats package to take statistics along an axis, but I am having trouble taking the percentile statistic using binned_statistic. I have generalized the code below, where I am trying to take the 10th percentile of a dataset with x, y values within a series of x bins, and it fails. I can of course use the built-in function options, like median, and even the numpy standard deviation using np.std. However, I cannot figure out how to use np.percentile because it requires 2 arguments (e.g. np.percentile(y, 10)), but then it gives me a
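The usual way around a statistic that needs two arguments is to wrap it in a one-argument callable. The sketch below shows the pattern with the standard library only: `percentile` and `binned` are hypothetical stand-ins for `np.percentile` and `binned_statistic`, written out so the example runs without NumPy or SciPy. With SciPy itself, the analogous wrapper would be something like `statistic=lambda v: np.percentile(v, 10)`, since `binned_statistic` accepts a callable that takes the values in one bin.

```python
import math
from functools import partial


def percentile(values, pct):
    """Linearly interpolated percentile of a non-empty list."""
    ordered = sorted(values)
    position = pct / 100 * (len(ordered) - 1)
    lower = math.floor(position)
    upper = math.ceil(position)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (position - lower)


def binned(x, y, edges, statistic):
    """Apply a one-argument statistic to the y values falling in each x bin."""
    bins = [[] for _ in range(len(edges) - 1)]
    for xi, yi in zip(x, y):
        for i in range(len(bins)):
            is_last = i == len(bins) - 1
            # Bins are half-open, except the last one, which includes its right edge.
            if edges[i] <= xi < edges[i + 1] or (is_last and xi == edges[-1]):
                bins[i].append(yi)
                break
    return [statistic(b) if b else float("nan") for b in bins]


# Fix the second argument so the statistic takes only the bin's values.
tenth_percentile = partial(percentile, pct=10)

x = [0.5, 0.6, 1.5, 1.6, 1.7]
y = [0, 10, 0, 50, 100]
result = binned(x, y, [0, 1, 2], tenth_percentile)
```

`functools.partial` and a `lambda` are interchangeable here; either one turns the two-argument function into the one-argument form a per-bin statistic expects.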