Making histogram with Spark DataFrame column

盖世英雄少女心 2020-12-16 03:18

I am trying to make a histogram with a column from a dataframe which looks like

DataFrame[C0: int, C1: int, ...]

If I were to make a histog

6 answers
  •  攒了一身酷
    2020-12-16 03:30

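    Let's say your values in C1 are between 1 and 1000 and you want a histogram with 10 bins. You can do something like: df.withColumn("bins", floor((df.C1 - 1) / 100)).groupBy("bins").count(), with floor imported from pyspark.sql.functions. The floor matters: a plain df.C1/100 produces a fractional value, so you would get one group per distinct value of C1 rather than one per bin. If your binning is more complex you can write a UDF for it (and at worst, you might need to analyze the column first, e.g. by using describe or through some other method).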
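    A minimal sketch of the fixed-width binning described above, assuming integer values starting at 1 and a bin width of 100. The helper names (bin_index, local_histogram, spark_histogram) and the default bounds are made up for illustration; only floor, col, lit, groupBy, count and orderBy are actual PySpark calls. The plain-Python helpers are there to sanity-check the bin arithmetic without a cluster.

```python
import math
from collections import Counter


def bin_index(value, lo=1, width=100):
    """Zero-based index of the fixed-width bin that contains `value`."""
    return math.floor((value - lo) / width)


def local_histogram(values, lo=1, width=100):
    """Plain-Python reference: count how many values fall in each bin."""
    counts = Counter(bin_index(v, lo, width) for v in values)
    return dict(sorted(counts.items()))


def spark_histogram(df, column, lo=1, width=100):
    """Same binning as a Spark aggregation; returns a (bin, count) DataFrame."""
    # Imported here so the helpers above also work without Spark installed.
    from pyspark.sql import functions as F

    # floor() collapses every value in [lo + k*width, lo + (k+1)*width) to k.
    bin_col = F.floor((F.col(column) - F.lit(lo)) / F.lit(width)).alias("bin")
    return df.groupBy(bin_col).count().orderBy("bin")
```

    With a DataFrame like the one in the question, spark_histogram(df, "C1").show() would print one row per non-empty bin. If the range is not known up front, the RDD API also has a built-in: df.select("C1").rdd.flatMap(lambda row: row).histogram(10) returns the bucket boundaries and the counts for 10 evenly spaced buckets.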
