Splitting complex rows of a DataFrame into simple rows in PySpark

终归单人心 2020-11-27 21:01

I have this code:

from pyspark import SparkContext
from pyspark.sql import SQLContext, Row

sc = SparkContext()
sqlContext = SQLContext(sc)
documents = sqlCo         


        
3 Answers
  •  不知归路
    2020-11-27 21:19

    Just explode it. explode emits one output row per element of the array column:

    from pyspark.sql.functions import explode
    
    # Replace the array column with one row per array element and print the result
    documents.withColumn("title", explode("title")).show()
    ## +---+----------------+
    ## | id|           title|
    ## +---+----------------+
    ## |  1|     [1000,cars]|
    ## |  2|  [50,horse bus]|
    ## |  2|[100,normal bus]|
    ## |  3| [5000,Airplane]|
    ## |  4|   [20,Bicycles]|
    ## |  4| [80,Motorbikes]|
    ## |  5|      [15,Trams]|
    ## +---+----------------+
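
To make the row-splitting behaviour concrete, here is a minimal plain-Python sketch of what explode does to each row. It is only a model of the semantics, not Spark code: the helper explode_rows, the dict-based rows, and the sample data (including the row with id 6) are assumptions made for illustration, since the question's original DataFrame definition is not shown in full.

```python
# Plain-Python model of explode: one output row per element of the array column.
# The helper name, the dict-based rows, and the sample data are hypothetical.

def explode_rows(rows, column):
    """Yield one copy of each row per element of row[column].

    Rows whose array is empty or None produce no output rows.
    """
    for row in rows:
        for element in row[column] or []:
            new_row = dict(row)          # copy so the input row is not mutated
            new_row[column] = element    # replace the array with a single element
            yield new_row


rows = [
    {"id": 2, "title": [(50, "horse bus"), (100, "normal bus")]},
    {"id": 3, "title": [(5000, "Airplane")]},
    {"id": 6, "title": []},  # hypothetical row with an empty array
]

exploded = list(explode_rows(rows, "title"))
for row in exploded:
    print(row["id"], row["title"])
```

Note that the row with the empty array disappears from the result. Spark's explode behaves the same way for empty or null arrays; if such rows should be kept (with a null in the exploded column), pyspark.sql.functions.explode_outer is the variant to look at.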
    
