median

Finding the median of an unsorted array

不打扰是莪最后的温柔, posted on 2019-11-26 08:07:58
Question: To find the median of an unsorted array, we can build a min-heap of the n elements in O(n log n) time and then extract n/2 elements one by one to reach the median. But this approach takes O(n log n) time. Can we do the same in O(n) time? If so, please suggest a method.
Answer 1: You can use the Median of Medians algorithm to find the median of an unsorted array in linear time.
Answer 2: I have already upvoted the @dasblinkenlight answer since the Median of Medians
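The Median of Medians idea named in Answer 1 can be sketched as follows. This is a minimal illustration, not code from either answer: the function names `select` and `median`, the group size of five, and the choice to average the two middle values for even-length input are all assumptions made here.

```python
def select(values, k):
    """Return the k-th smallest element (0-based) using median-of-medians pivots."""
    items = list(values)
    while True:
        if len(items) <= 5:
            return sorted(items)[k]
        # Take the median of each group of (at most) five elements.
        groups = [items[i:i + 5] for i in range(0, len(items), 5)]
        medians = [sorted(g)[len(g) // 2] for g in groups]
        # The median of those medians is a pivot that guarantees a balanced split.
        pivot = select(medians, len(medians) // 2)
        lower = [x for x in items if x < pivot]
        equal_count = sum(1 for x in items if x == pivot)
        if k < len(lower):
            items = lower
        elif k < len(lower) + equal_count:
            return pivot
        else:
            k -= len(lower) + equal_count
            items = [x for x in items if x > pivot]


def median(values):
    """Median of a non-empty sequence; averages the two middle values if n is even."""
    n = len(values)
    if n == 0:
        raise ValueError("median of empty sequence")
    if n % 2 == 1:
        return select(values, n // 2)
    return (select(values, n // 2 - 1) + select(values, n // 2)) / 2
```

For example, `median([3, 1, 2])` selects the element of rank 1 without sorting the whole input. The list copies keep the sketch short; an in-place partition would avoid the extra memory.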

O(n) algorithm to find the median of a collection of numbers

淺唱寂寞╮, posted on 2019-11-26 07:29:13
Question: Problem: the input is a (not necessarily sorted) sequence S = k1, k2, ..., kn of n arbitrary numbers. Consider the collection C of n² numbers of the form min{ki, kj}, for 1 <= i, j <= n. Present an O(n) time and O(n) space algorithm to find the median of C. So far I've found, by examining C for different sets S, that the smallest number in S occurs (2n-1) times in C, the next smallest (2n-3) times, and so on, until the largest number occurs only once. Is there a
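The counting observation in the question suggests one way to proceed: the i-th smallest element of S (1-based) occurs 2(n-i)+1 times in C, so the i smallest elements together account for n² - (n-i)² entries. The sketch below uses that to find which rank of S holds the median of C. It is an illustration under stated assumptions, not the answer from the thread: the function name is hypothetical, the lower median is used when n² is even, and `sorted` is used for brevity, so this version runs in O(n log n); a linear-time selection routine would have to be swapped in to meet the O(n) bound.

```python
def median_of_pairwise_minima(s):
    """Lower median of the multiset {min(a, b) for a in s for b in s}."""
    n = len(s)
    if n == 0:
        raise ValueError("empty sequence")
    total = n * n
    target = (total + 1) // 2  # 1-based rank of the lower median within C
    # The i smallest elements of s cover total - (n - i)**2 entries of C.
    rank = 1
    while total - (n - rank) ** 2 < target:
        rank += 1
    # Assumption: sorting stands in for a linear-time selection of this rank.
    return sorted(s)[rank - 1]
```

Only the rank computation depends on the counting argument; the final lookup is an ordinary order-statistic query on S.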

Rolling median algorithm in C

ε祈祈猫儿з, posted on 2019-11-26 06:55:30
Question: I am currently working on an algorithm to implement a rolling median filter (analogous to a rolling mean filter) in C. From my search of the literature, there appear to be two reasonably efficient ways to do it. The first is to sort the initial window of values, then at each iteration use a binary search to insert the new value and remove the outgoing one. The second (from Hardle and Steiger, 1995, JRSS-C, Algorithm 296) builds a double-ended heap structure, with a maxheap on one end, a
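The first approach described above (keep the window sorted, then binary-search to insert the incoming value and remove the outgoing one) can be sketched like this. It is written in Python rather than C for brevity, and the function name, the averaging rule for even window sizes, and the decision to emit output only once the window is full are assumptions of this sketch.

```python
from bisect import bisect_left, insort
from collections import deque


def rolling_median(values, window):
    """Return the median of each full window of the given size."""
    if window <= 0:
        raise ValueError("window must be positive")
    recent = deque()   # values in arrival order, to know which one leaves
    ordered = []       # the same values kept in sorted order
    result = []
    for value in values:
        recent.append(value)
        insort(ordered, value)  # binary search for the insertion point
        if len(recent) > window:
            outgoing = recent.popleft()
            del ordered[bisect_left(ordered, outgoing)]  # binary search to remove
        if len(recent) == window:
            mid = window // 2
            if window % 2 == 1:
                result.append(ordered[mid])
            else:
                result.append((ordered[mid - 1] + ordered[mid]) / 2)
    return result
```

The searches are logarithmic, but inserting into and deleting from a Python list still shifts elements, so each step costs O(window) in the worst case.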

How to calculate mean/median per group in a dataframe in R [duplicate]

廉价感情., posted on 2019-11-26 05:35:35
Question: This question already has an answer here: Mean per group in a data.frame [duplicate] (8 answers)
I have a dataframe recording in detail how much money each customer spent, like the following:
custid, value
1, 1
1, 3
1, 2
1, 5
1, 4
1, 1
2, 1
2, 10
3, 1
3, 2
3, 5
How can I calculate per-customer characteristics using mean, max, median, std, etc., like the following? Should I use some apply function, and if so, how?
custid, mean, max, min, median, std
1, ....
2, ....
3, ....
Answer 1: To add to the alternatives, here's summaryBy from the "doBy"
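The per-group summary the question asks for can be illustrated with a short standard-library sketch. Note that this is Python, not R, and it is not the `summaryBy` approach from the answer; the function name `summarize` is hypothetical, and using the sample standard deviation (`statistics.stdev`) for "std" is an assumption.

```python
from collections import defaultdict
from statistics import mean, median, stdev

# The (custid, value) rows from the question.
rows = [(1, 1), (1, 3), (1, 2), (1, 5), (1, 4), (1, 1),
        (2, 1), (2, 10), (3, 1), (3, 2), (3, 5)]


def summarize(rows):
    """Group values by custid and compute summary statistics per group."""
    groups = defaultdict(list)
    for custid, value in rows:
        groups[custid].append(value)
    return {
        custid: {
            "mean": mean(vals),
            "max": max(vals),
            "min": min(vals),
            "median": median(vals),
            # Sample standard deviation needs at least two values.
            "std": stdev(vals) if len(vals) > 1 else float("nan"),
        }
        for custid, vals in groups.items()
    }


summary = summarize(rows)
```

Each key of `summary` is a `custid`, and each value is a dictionary holding one row of the desired output table.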

Find running median from a stream of integers

旧时模样, posted on 2019-11-26 02:59:25
Question: Possible Duplicate: Rolling median algorithm in C
Given integers read from a data stream, find the median of the elements read so far in an efficient way. Solution I have read: we can use a max heap on the left side to represent elements that are less than the effective median, and a min heap on the right side to represent elements that are greater than the effective median. After processing an incoming element, the numbers of elements in the two heaps differ by at most 1. When both heaps contain the
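The two-heap scheme described in the question can be sketched as below. The class and method names are hypothetical, and averaging the two heap tops when the heaps are the same size is an assumption of this sketch, since the excerpt is cut off before it states that rule. Python's `heapq` only provides a min-heap, so the left (max) heap stores negated values.

```python
import heapq


class RunningMedian:
    """Tracks the median of all values added so far using two heaps."""

    def __init__(self):
        self._low = []   # max-heap of the smaller half (values stored negated)
        self._high = []  # min-heap of the larger half

    def add(self, value):
        # Route the value to the half it belongs to.
        if self._low and value > -self._low[0]:
            heapq.heappush(self._high, value)
        else:
            heapq.heappush(self._low, -value)
        # Rebalance so the heap sizes differ by at most one.
        if len(self._low) > len(self._high) + 1:
            heapq.heappush(self._high, -heapq.heappop(self._low))
        elif len(self._high) > len(self._low) + 1:
            heapq.heappush(self._low, -heapq.heappop(self._high))

    def median(self):
        if not self._low and not self._high:
            raise ValueError("no values added yet")
        if len(self._low) > len(self._high):
            return -self._low[0]
        if len(self._high) > len(self._low):
            return self._high[0]
        # Assumption: equal sizes mean the median is the average of both tops.
        return (-self._low[0] + self._high[0]) / 2
```

Each `add` costs O(log n) and each `median` query is O(1), which is what makes the approach attractive for streams.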

Function to Calculate Median in SQL Server

我的梦境, posted on 2019-11-26 01:22:02
Question: According to MSDN, Median is not available as an aggregate function in Transact-SQL. However, I would like to find out whether it is possible to create this functionality (using the Create Aggregate function, a user-defined function, or some other method). What would be the best way (if possible) to do this, that is, to allow a median value (assuming a numeric data type) to be calculated in an aggregate query?
Answer 1: 2019 UPDATE: In the 10 years since I wrote this answer, more solutions have been

How to find median and quantiles using Spark

痴心易碎, posted on 2019-11-26 00:24:02
Question: How can I find the median of an RDD of integers using a distributed method, IPython, and Spark? The RDD is approximately 700,000 elements and therefore too large to collect and find the median. This question is similar to this question. However, the answer to that question uses Scala, which I do not know. How can I calculate the exact median with Apache Spark? Following the reasoning of the Scala answer, I am trying to write a similar answer in Python. I know I first want to sort the RDD. I do not

Simple way to calculate median with MySQL

穿精又带淫゛_, posted on 2019-11-25 22:49:20
Question: What's the simplest (and hopefully not too slow) way to calculate the median with MySQL? I've used AVG(x) for finding the mean, but I'm having a hard time finding a simple way of calculating the median. For now, I'm returning all the rows to PHP, doing a sort, and then picking the middle row, but surely there must be some simple way of doing it in a single MySQL query. Example data:
id | val
--------
 1    4
 2    7
 3    2
 4    2
 5    9
 6    8
 7    3
Sorting on val gives 2 2 3 4 7 8 9, so the median should be