get top and bottom 25th percentile average

匿名 (未验证) 提交于 2019-12-03 10:24:21

问题:

I have a table with list of employees and the number of units that they have sold.

I want to get the top 25 percentile Avg units sold and Bottom 25 percentile Avg units sold.

I have created a representation of my data SLQ Fiddle

I really have no idea how to start on this? All the examples i see are for SQL Server and not MySQL. Here is what i am thinking.

I want 25 percentile and cant limit to 25 items. Basically it would involve:

1) #_of_employees = The number of total employees.   2) #_of_employees_in_25_percentile =  #_of_employees*0.25   3) Calculate the sum of the units sold by the top/bottom 25 percentile (limit #_of_employees_in_25_percentile)  4) Divide the sum by #_of_employees_in_25_percentile to get the average.

How can all this be done efficiently in MySQL?

回答1:

This is a solution that uses a devious trick I learned from this question.

SELECT id, unit_sold, n * 100 / @total AS percentile FROM (   SELECT id, unit_sold, @total := @total + unit_sold AS n   FROM mydata, (SELECT @total := 0) AS total   ORDER BY unit_sold ASC ) AS t

SQL Fiddle.



回答2:

How about going with this logic:

  • Select all, order by percentile (DESC), limit to 25
  • Select all, order by percentile (ASC), limit to 25

Is this the type of logic you're looking for ?

Sample queries:

$q1 = mysql_query(SELECT * FROM table_name ORDER BY percentile DESC LIMIT 25) $q2 = mysql_query(SELECT * FROM table_name ORDER BY percentile ASC LIMIT 25)


回答3:

What about this?

SELECT   SUM(unit_sold) AS sum_tot, SUM(unit_sold)/count(id) AS average,   SUM(CASE WHEN percentile<25 THEN unit_sold ELSE 0 END) AS sum_top25,   SUM(CASE WHEN percentile<25 THEN 1 ELSE 0 END) AS count_top25,   SUM(CASE WHEN percentile<25 THEN unit_sold ELSE 0 END)/SUM(CASE WHEN percentile<25 THEN 1 ELSE 0 END) AS average_top25,   SUM(CASE WHEN percentile>75 THEN unit_sold ELSE 0 END) AS sum_bottom25,   SUM(CASE WHEN percentile>75 THEN 1 ELSE 0 END) AS count_bottom25,   SUM(CASE WHEN percentile>75 THEN unit_sold ELSE 0 END)/SUM(CASE WHEN percentile>75 THEN 1 ELSE 0 END) AS average_bottom25 FROM (SELECT   id, unit_sold, c * 100 / @counter AS percentile FROM    (SELECT       m.*, @counter:=@counter+1 AS c     FROM      (SELECT @counter:=0) AS initvar, mydata AS m    ORDER BY unit_sold desc    ) AS t WHERE   c <= (25/100 * @counter)   OR c >= (75/100 * @counter) ) AS t2

Output:

SUM_TOT AVERAGE     SUM_TOP25   COUNT_TOP25 AVERAGE_TOP25 SUM_BOTTOM25  COUNT_BOTTOM25  AVERAGE_BOTTOM25 850     283.3333    500         1            500          350           2               175

See SQL Fiddle.

The idea is to use the MySQL: LIMIT by a percentage of the amount of records? solution to get the percentiles. Based on that (and on pdw answer) we create an output in which we just show the top 25% and bottom 75%.

Finally, we count and sum to get the values you requested.


Note this runs on top of the command:

SELECT   id, unit_sold, c * 100 / @counter AS percentile FROM    (SELECT       m.*, @counter:=@counter+1 AS c     FROM      (SELECT @counter:=0) AS initvar, mydata AS m    ORDER BY unit_sold desc    ) AS t WHERE   c <= (25/100 * @counter)   OR c >= (75/100 * @counter)

Whose output is:

ID  UNIT_SOLD   PERCENTILE d   500         20 a   250         80 e   100         100


易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!