approxQuantile gives an incorrect median in Spark (Scala)?

太阳男子 2020-12-19 09:53

I have this test data:

 val data = List(
        List(47.5335D),
        List(67.5335D),
        List(69.5335D),
        List(444.1235D),
        List(677.53         


        
3 Answers
  •  余生分开走
    2020-12-19 10:20

    Note that this is an approximate quantile computation. It is not guaranteed to give you the exact answer every time. See here for a more thorough explanation.

    The reason is that for very large datasets, sometimes you are OK with an approximate answer, as long as you get it significantly faster than the exact computation.
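
As a minimal sketch of that trade-off: the third argument of `approxQuantile` is `relativeError`, and setting it to `0.0` asks Spark for exact quantiles at a higher cost. The object name `ApproxQuantileSketch`, the helper `median`, the column name `value`, and the sample values are all hypothetical and are not the asker's dataset. Also note that `approxQuantile` returns a value that actually occurs in the column, so for an even number of rows it will not average the two middle values the way a textbook median does.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object ApproxQuantileSketch {

  // Hypothetical helper: median of a numeric column for a given relativeError.
  // relativeError = 0.0 requests exact quantiles (can be expensive on large data).
  def median(df: DataFrame, column: String, relativeError: Double): Double =
    df.stat.approxQuantile(column, Array(0.5), relativeError).head

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("approx-quantile-sketch")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample values, chosen only for illustration.
    val df = Seq(1.0, 2.0, 3.0, 4.0, 5.0).toDF("value")

    // Exact: the returned value sits exactly at the median rank.
    val exact = median(df, "value", 0.0)

    // Approximate: the rank of the returned value may be off by up to
    // relativeError * rowCount, so a neighbouring value can come back.
    val approx = median(df, "value", 0.25)

    println(s"exact median:  $exact")
    println(s"approx median: $approx")

    spark.stop()
  }
}
```

On a small dataset the exact setting costs almost nothing, so a loose `relativeError` mainly pays off when the column is too large for an exact computation to be practical.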
