show distinct column values in pyspark dataframe: python

忘了有多久 2020-12-23 10:55

Please suggest a PySpark DataFrame alternative to pandas df['col'].unique().

I want to list all the unique values in a PySpark DataFrame column.

9 answers
  •  一向 (original poster)
    2020-12-23 11:43

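    collect_set can help get the unique values from a given column of a pyspark.sql.DataFrame. Here F is pyspark.sql.functions (from pyspark.sql import functions as F), and the result is a Python list:

    df.select(F.collect_set("column").alias("column")).first()["column"]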
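
    Another option that reads closer to pandas unique() is select() followed by distinct(). The sketch below is only an illustration: unique_values and demo are hypothetical names, not part of PySpark, and the column name "col" and the sample rows are made up. It assumes the set of distinct values is small enough to collect to the driver.

```python
def unique_values(df, column):
    """Return the distinct values of `column` as a plain Python list.

    `unique_values` is a hypothetical helper name, not a PySpark API.
    """
    # select() narrows the DataFrame to one column, distinct() drops
    # duplicate rows, and collect() brings the result to the driver.
    rows = df.select(column).distinct().collect()
    return [row[column] for row in rows]


def demo():
    # Imported here so the helper above can be read without Spark installed.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.master("local[1]")
        .appName("distinct-demo")
        .getOrCreate()
    )
    # Made-up sample data with one repeated value.
    df = spark.createDataFrame([("a",), ("b",), ("a",)], ["col"])
    # Order of distinct values is not guaranteed, so sort before printing.
    print(sorted(unique_values(df, "col")))
    spark.stop()
```

    Calling demo() needs a local PySpark installation. One difference worth knowing: collect_set skips nulls, while distinct() keeps null as one of the values, so the two approaches can differ on columns that contain nulls.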
