PySpark - Explode columns into rows based on the type of the column

情歌与酒 2021-01-27 06:12

Given a DataFrame:

+---+-----------+---------+-------+------------+
| id|      score|tx_amount|isValid|    greeting|
+---+-----------+---------+-------+------------+


        
2 answers
  •  天命终不由人
    2021-01-27 06:36

    You can build one select per column and combine them with several unions:

    
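    from pyspark.sql import functions as F

    # One select per source column: cast the value to string so that every
    # branch has the same schema, and flag which column the value came from.
    df = df.select(
        "id",
        F.col("score").cast("string").alias("col_value"),
        F.lit("Y").alias("is_score"),
        F.lit("N").alias("is_amount"),
        F.lit("N").alias("is_boolean"),
        F.lit("N").alias("is_text"),
    ).union(df.select(
        "id",
        F.col("tx_amount").cast("string").alias("col_value"),
        F.lit("N").alias("is_score"),
        F.lit("Y").alias("is_amount"),
        F.lit("N").alias("is_boolean"),
        F.lit("N").alias("is_text"),
    )).union(...)  # etc.: repeat the same pattern for the remaining columns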
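
    If there are more than a couple of columns, the repeated selects can be generated in a loop instead of written out by hand. The following is only a minimal sketch: the names `COLUMN_FLAGS`, `flag_values` and `explode_by_type` are hypothetical, and pairing `isValid` with `is_boolean` and `greeting` with `is_text` is an assumption inferred from the column and flag names above.

```python
from functools import reduce

# Source column -> flag it should set to "Y".
# The isValid/greeting pairings are assumptions, not confirmed by the question.
COLUMN_FLAGS = {
    "score": "is_score",
    "tx_amount": "is_amount",
    "isValid": "is_boolean",
    "greeting": "is_text",
}


def flag_values(active_flag, all_flags):
    """Return {flag: "Y" or "N"} with only active_flag set to "Y"."""
    return {flag: ("Y" if flag == active_flag else "N") for flag in all_flags}


def explode_by_type(df, column_flags=COLUMN_FLAGS, id_col="id"):
    """Build one select per source column and union them into one DataFrame."""
    # Imported locally so that flag_values can be used without Spark installed.
    from pyspark.sql import functions as F

    all_flags = list(column_flags.values())
    parts = []
    for column, active_flag in column_flags.items():
        flags = flag_values(active_flag, all_flags)
        parts.append(
            df.select(
                id_col,
                F.col(column).cast("string").alias("col_value"),
                *[F.lit(value).alias(flag) for flag, value in flags.items()],
            )
        )
    # union() matches columns by position, so every part keeps the same order.
    return reduce(lambda left, right: left.union(right), parts)
```

    Usage would then be `df = explode_by_type(df)`. Each input row yields one output row per entry in `COLUMN_FLAGS`, which matches what the chained unions above produce.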
    
    
