How to compute percentiles in Apache Spark

遥遥无期 2020-12-04 22:08

I have an RDD of integers (i.e. RDD[Int]) and I would like to compute the following percentiles: [0th, 10th, 20th, ..., 90th, 100th]

10 Answers
  •  悲&欢浪女
    2020-12-04 23:02

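    Another option is to use top and last on an RDD of doubles. For example, with count holding the number of elements (scores.count()): val percentile_99th_value = scores.top((count / 100).toInt).last. Here top returns the largest count/100 elements in descending order, so last picks the smallest value within the top 1%, which approximates the 99th percentile.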

    This method is better suited to computing individual percentiles, particularly high ones, because top returns its results to the driver as an array.
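A minimal sketch of the top/last idea, assuming a local SparkSession and an RDD[Double] named scores. The object name TopPercentileSketch and the helper percentileViaTop are hypothetical names introduced here for illustration, and the generalisation from the 99th percentile to an arbitrary integer p is an assumption, not part of the original answer.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession

object TopPercentileSketch {

  // Hypothetical helper: approximates the p-th percentile as the smallest
  // value among the top (100 - p)% of elements.
  def percentileViaTop(scores: RDD[Double], p: Int): Double = {
    require(p >= 0 && p < 100, "p must be in [0, 100)")
    val count = scores.count()
    require(count > 0, "cannot take a percentile of an empty RDD")
    // Number of elements to pull to the driver; at least one so .last is defined.
    val n = math.max(1, (count * (100 - p) / 100).toInt)
    // top(n) returns the n largest elements in descending order.
    scores.top(n).last
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("top-percentile-sketch")
      .getOrCreate()

    // Sample data: the values 1.0 to 1000.0.
    val scores = spark.sparkContext.parallelize((1 to 1000).map(_.toDouble))

    println(percentileViaTop(scores, 99))

    spark.stop()
  }
}
```

Each call counts the RDD and then collects n elements to the driver, so low percentiles on a large RDD pull most of the data back. Computing all of [0th, 10th, ..., 100th] this way would repeat that work per percentile, which is consistent with the remark that the approach fits individual percentiles best.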
