How to calculate or approximate the median of a list without storing the list

你的背包 2020-11-28 22:24

I'm trying to calculate the median of a set of values, but I don't want to store all the values as that could blow memory requirements. Is there a way of calculating or ap

10 answers
  •  刺人心
    刺人心 (OP)
    2020-11-28 22:51

    I picked up the idea of iterative quantile calculation. It is important to have good values for the starting point and for the step size eta; these can be derived from the mean and sigma. So I programmed this:

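    // signum_smooth() and xy() are helpers not shown in this post:
    // xy(a, b) is a^b, signum_smooth(d, eta) is a smoothed sign function.
    Function QuantileIterative(Var x : Array of Double; n : Integer; p, mean, sigma : Double) : Double;
    Var eta, quantile, q1, dq : Double;
        i : Integer;
    Begin
      quantile := mean + 1.25*sigma*(p-0.5); // starting guess from mean and sigma
      q1 := quantile;
      eta := 0.2*sigma/xy(1+n,0.75); // step size; should not be too large! sets accuracy
      // First pass. Open array parameters are zero-based, so index 0..n-1.
      For i:=0 to n-1 Do
         quantile := quantile + eta * (signum_smooth(x[i] - quantile,eta) + 2*p - 1);
      dq := abs(q1-quantile);
      // Second pass only if the estimate moved by more than one step
      If dq>eta
         then Begin
              If dq<3*eta then eta:=eta/4; // small move: refine with a finer step
              For i:=0 to n-1 Do
                 quantile := quantile + eta * (signum_smooth(x[i] - quantile,eta) + 2*p - 1);
         end;
      QuantileIterative := quantile
    end;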
    

    Because the median of two elements should be their mean, I used a smoothed signum function; xy() is x^y. Are there ideas for improving it? Of course, with more a priori knowledge we could add code that uses the min and max of the array, the skew, etc. For big data you would probably not use an array, but for testing it is easier.
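
For the no-array case, here is a minimal sketch of the same update rule applied one value at a time. It is written in Python rather than Pascal, so it is a port and not the code above. The post does not show signum_smooth, so the clipped linear ramp below is only an assumed choice of smoothing, and the StreamingQuantile class with its update method is a hypothetical name introduced for illustration.

```python
def signum_smooth(d, eta):
    """Sign of d, ramped linearly through zero over a band of half-width eta.

    Assumption: the original post does not show its smoothing function;
    a clipped linear ramp is just one plausible choice.
    """
    if eta <= 0:
        return float((d > 0) - (d < 0))
    return max(-1.0, min(1.0, d / eta))


class StreamingQuantile:
    """Hypothetical helper: tracks the p-quantile of a stream in O(1) memory."""

    def __init__(self, p, start, eta):
        self.p = p        # target quantile, 0.5 for the median
        self.q = start    # current estimate (starting guess)
        self.eta = eta    # step size; smaller is more accurate but slower

    def update(self, x):
        # Same rule as the loop body in QuantileIterative: move up when x
        # lies above the estimate, down when below, biased by 2*p - 1.
        self.q += self.eta * (signum_smooth(x - self.q, self.eta) + 2 * self.p - 1)
        return self.q


# Usage: a deterministic stream cycling through 0..100, whose median is 50.
est = StreamingQuantile(p=0.5, start=40.0, eta=0.05)
for i in range(20000):
    est.update((i * 7) % 101)
print(est.q)
```

The starting guess and eta are passed in by the caller. They could come from a running mean and sigma as in the Pascal version, but a single pass with a small eta may not have travelled far enough if the starting guess is poor, which is what the second pass in QuantileIterative compensates for.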
