Extract column values of Dataframe as List in Apache Spark

慢半拍i 2020-12-22 16:52

I want to convert a string column of a DataFrame to a list. The only route I can find in the DataFrame API is through an RDD, so I tried converting it back to an RDD first, and then

10 answers
  •  遥遥无期
    2020-12-22 17:37

    The question and the given answer assume Scala, so here is a small Python snippet in case a PySpark user is curious. The syntax is similar to the given answer, but to pull the values out as a list I have to reference the column name a second time, inside the mapping function, and I do not need the select statement.

    For example, take a DataFrame containing a column named "Raw".

    To collect the values of "Raw" into a list, with one entry per row, I simply use:

    MyDataFrame.rdd.map(lambda x: x.Raw).collect()
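
For anyone who wants to try this end to end, here is a minimal sketch. The helper name `column_to_list`, the `demo` function, and the sample rows are hypothetical and only for illustration; the local `SparkSession` setup is an assumption about your environment. It shows the one-liner above next to a variant that selects the column first, which may be preferable when the DataFrame has many columns.

```python
def column_to_list(df, column_name):
    """Collect one column of a DataFrame into a Python list on the driver."""
    # select() narrows the DataFrame to a single column before collect()
    # pulls the rows back to the driver; each Row is indexed by column name.
    return [row[column_name] for row in df.select(column_name).collect()]


def demo():
    # Imported here so the helper above does not itself depend on pyspark.
    from pyspark.sql import SparkSession

    # Assumption: a local single-threaded session is enough for a quick test.
    spark = (
        SparkSession.builder.master("local[1]")
        .appName("column-to-list")
        .getOrCreate()
    )

    # Hypothetical sample data with a single string column named "Raw".
    my_df = spark.createDataFrame([("a",), ("b",), ("c",)], ["Raw"])

    # Approach from this answer: map over the underlying RDD.
    print(my_df.rdd.map(lambda x: x.Raw).collect())

    # Variant: select the column, collect, then unpack each Row.
    print(column_to_list(my_df, "Raw"))

    spark.stop()
```

Calling `demo()` runs both variants. Either way, `collect()` brings every value to the driver, so this is only suitable when the column fits in driver memory.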
    
