PySpark: show histogram of a data frame column

情书的邮戳 2020-12-14 01:04

With a pandas DataFrame, I use the following code to plot a histogram of a column:

my_df.hist(column='field_1')

Is there something that

5 answers
  •  庸人自扰
    2020-12-14 01:57

    This is straightforward and works well.

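    # Group by a column (its name is blank here; supply your own), count the
    # rows per group, then compute a 20-bucket histogram of those counts.
    df.groupby(
      ''
    ).count().select(
      'count'
    ).rdd.flatMap(
      lambda x: x
    ).histogram(20)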
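
If the goal is the distribution of the values in a single numeric column, as with the pandas call in the question, a minimal sketch could look like the following. It assumes `df` is a PySpark DataFrame and that the column is numeric. The helper names `column_histogram`, `bar_geometry`, and `plot_histogram` are hypothetical, and matplotlib is assumed to be installed on the driver for the plotting step.

```python
def column_histogram(df, column, bins=20):
    """Return (edges, counts) for a numeric column, computed by Spark."""
    # Keep only the target column, drop nulls, and unwrap each Row
    # into its single value so the RDD holds plain numbers.
    values = df.select(column).dropna().rdd.flatMap(lambda row: row)
    # RDD.histogram(n) splits [min, max] into n equal-width buckets.
    return values.histogram(bins)


def bar_geometry(edges, counts):
    """Turn bucket edges into (left_edges, widths) for a bar chart."""
    lefts = list(edges[:-1])
    widths = [hi - lo for lo, hi in zip(edges[:-1], edges[1:])]
    if len(lefts) != len(counts):
        raise ValueError("expected len(edges) == len(counts) + 1")
    return lefts, widths


def plot_histogram(edges, counts, label):
    """Draw the precomputed histogram on the driver with matplotlib."""
    import matplotlib.pyplot as plt  # assumed to be installed

    lefts, widths = bar_geometry(edges, counts)
    plt.bar(lefts, counts, width=widths, align="edge")
    plt.xlabel(label)
    plt.ylabel("count")
    plt.show()


# Hypothetical usage, with `my_df` and 'field_1' taken from the question:
# edges, counts = column_histogram(my_df, "field_1", bins=20)
# plot_histogram(edges, counts, "field_1")
```

Only the bucket boundaries and counts are returned to the driver, so the plot can be drawn without collecting the whole column.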
    
