Extract column values of Dataframe as List in Apache Spark

慢半拍i 2020-12-22 16:52

I want to convert a string column of a data frame to a list. What I can find from the Dataframe API is RDD, so I tried converting it back to RDD first, and then

10 answers
  •  刺人心 (OP)
    2020-12-22 17:29

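    from pyspark.sql.functions import col
    
    # collect() returns a list of Row objects; take the value out of each Row
    values = [row[0] for row in df.select(col("column_name")).collect()]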
    

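    Here, collect() is an action that brings the selected rows back to the driver as a Python list of Row objects. Be wary of using it on a huge data set: every row is loaded into the driver's memory, which hurts performance and can exhaust memory. It is a good idea to check the size of the data first.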
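To go with the warning about large data sets, here is a minimal sketch of one way to wrap this up. The helper name `column_to_list` and its `max_rows` parameter are made up for illustration and are not part of the Spark API; it only assumes that `df` is a Spark DataFrame, using its `select`, `limit` and `collect` methods.

```python
def column_to_list(df, column_name, max_rows=None):
    """Return the values of one DataFrame column as a plain Python list.

    column_to_list and max_rows are hypothetical names for this sketch,
    not Spark API. df is assumed to be a Spark DataFrame.
    """
    selected = df.select(column_name)
    if max_rows is not None:
        # limit() keeps at most max_rows rows, which bounds how much
        # data collect() pulls back to the driver
        selected = selected.limit(max_rows)
    # collect() returns Row objects; index 0 is the single selected column
    return [row[0] for row in selected.collect()]
```

For example, `column_to_list(df, "column_name", max_rows=1000)` would return at most 1000 values. Leaving `max_rows` as `None` collects the whole column, so that form is only reasonable when the column is known to fit in driver memory.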
