Function to Calculate Median in SQL Server

孤独总比滥情好 2020-11-22 04:03

According to MSDN, Median is not available as an aggregate function in Transact-SQL. However, I would like to find out whether it is possible to create this functionality (u

30 Answers
  •  甜味超标
    2020-11-22 04:32

    My original quick answer was:

    select  max(my_column) as [my_column], quartile
    from    (select my_column, ntile(4) over (order by my_column) as [quartile]
             from   my_table) i
    --where quartile = 2
    group by quartile
    

    This gives you the median and the quartile boundaries (and hence the interquartile range) in one fell swoop. If you really only want the single row that is the median, uncomment the where clause.

    When you look at the execution plan for this query, about 60% of the work is sorting the data, which is unavoidable when calculating position-dependent statistics like this.
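
    To see what the quick query returns, here is a small sketch run against a hypothetical table variable `@my_table` holding the values 1 to 8 (the sample data and the table variable are assumptions for illustration, not part of the original question). With an even number of rows, quartile 2 comes back as 4, one of the two middle values, rather than the textbook median of 4.5:

    ```sql
    -- Hypothetical sample data; @my_table stands in for my_table.
    declare @my_table table (my_column int);
    insert into @my_table (my_column)
    values (1), (2), (3), (4), (5), (6), (7), (8);

    -- Same shape as the quick query: the highest value in each quartile.
    select  max(my_column) as [my_column], quartile
    from    (select my_column, ntile(4) over (order by my_column) as [quartile]
             from   @my_table) i
    group by quartile
    order by quartile;
    -- Eight rows split into four groups of two, so the maxima are
    -- quartile 1 -> 2, quartile 2 -> 4, quartile 3 -> 6, quartile 4 -> 8.
    ```

    With an odd number of rows, the maximum of quartile 2 lands on the middle row, so the approximation only shows up for even counts.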

    I've amended the answer to follow the excellent suggestion from Robert Ševčík-Robajz in the comments below:

    ;with PartitionedData as
      -- ntile(10) splits the ordered rows into ten groups (deciles)
      (select my_column, ntile(10) over (order by my_column) as [percentile]
       from   my_table),
    MinimaAndMaxima as
      -- lowest and highest value within each group
      (select  min(my_column) as [low], max(my_column) as [high], percentile
       from    PartitionedData
       group by percentile)
    select
      case
        -- the last group has no successor, so report its maximum
        when b.percentile = 10 then cast(b.high as decimal(18,2))
        -- otherwise average the top of group b with the bottom of the next group a
        else cast((a.low + b.high)  as decimal(18,2)) / 2
      end as [value], --b.high, a.low,
      b.percentile
    from    MinimaAndMaxima a
      join  MinimaAndMaxima b on (a.percentile -1 = b.percentile) or (a.percentile = 10 and b.percentile = 10)
    --where b.percentile = 5
    

    This should calculate the correct median and percentile values when you have an even number of data items. Again, uncomment the final where clause if you only want the median and not the entire percentile distribution.
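
    One caveat worth checking against your own data: `ntile` puts the larger groups first when the row count does not divide evenly, so the boundary between groups 5 and 6 is not guaranteed to sit exactly in the middle. With the twelve values 1 to 12, for example, groups 1 and 2 hold two rows each and the rest hold one, so the query above reports (7 + 8) / 2 = 7.5 for group 5, whereas the median is 6.5. As a cross-check, here is a minimal sketch of a different technique, based on `row_number()` and `count(*) over ()` rather than `ntile`; the table variable and sample data are again hypothetical:

    ```sql
    -- Hypothetical sample data; @my_table stands in for my_table.
    declare @my_table table (my_column int);
    insert into @my_table (my_column)
    values (1), (2), (3), (4), (5), (6), (7), (8), (9), (10), (11), (12);

    ;with Ranked as
      (select my_column,
              row_number() over (order by my_column) as rn,   -- position in sorted order
              count(*) over () as cnt                         -- total number of rows
       from   @my_table)
    -- Integer division picks the middle row for odd counts (both expressions
    -- are equal) and the two middle rows for even counts.
    select avg(cast(my_column as decimal(18,2))) as [median]
    from   Ranked
    where  rn in ((cnt + 1) / 2, (cnt + 2) / 2);
    -- For the twelve sample rows this averages rows 6 and 7, giving 6.5.
    ```

    This still has to sort the data, so the cost profile should be similar; the difference is only in how the middle position is located.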
