How to transform DataFrame before joining operation?

旧时难觅i 2020-12-22 10:00

The following code is used to extract ranks from the column products. The ranks are second numbers in each pair [...]. For example, in the given ex

1 Answer
  • 2020-12-22 10:18

    As per my answer here, you can transform df_products using something like this:

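    import org.apache.spark.sql.functions.explode

    // One output row per element of the products array
    val df1 = df_products.withColumn("array_elem", explode(df_products("products")))
    // Expand the struct fields of each element into top-level columns
    val df2 = df1.select("product_PK", "array_elem.*")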
    

    This assumes products is an array of structs. If products is an array of arrays, you can use the following instead:

    // array_elem holds one inner pair per row; index 1 is the second number (the rank)
    val df2 = df1.withColumn("rank", df1("array_elem").getItem(1))
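
To make both variants easier to try out, here is a minimal self-contained sketch. Everything not named in the answer is an assumption for illustration: the `ProductEntry` case class and its `id` and `rank` fields, the sample rows, the helper names `ranksFromStructs` and `ranksFromPairs`, and the local `SparkSession` settings. The real schema of `products` is not shown in the question, so adjust the field names to match it.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, explode}

// Hypothetical struct layout; the real field names are not given in the question.
case class ProductEntry(id: Int, rank: Int)

object ExplodeProductsSketch {
  // Case 1: products is an array of structs.
  // explode yields one row per struct, and "array_elem.*" expands its fields.
  def ranksFromStructs(df: DataFrame): DataFrame =
    df.withColumn("array_elem", explode(col("products")))
      .select("product_PK", "array_elem.*")

  // Case 2: products is an array of arrays, each inner array being a pair.
  // After explode, getItem(1) picks the second number of each pair.
  def ranksFromPairs(df: DataFrame): DataFrame =
    df.withColumn("array_elem", explode(col("products")))
      .withColumn("rank", col("array_elem").getItem(1))
      .select("product_PK", "rank")

  def main(args: Array[String]): Unit = {
    // Local session for illustration only.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("explode-sketch")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample rows with the same logical content in both layouts.
    val structDf = Seq((1, Seq(ProductEntry(101, 1), ProductEntry(102, 2))))
      .toDF("product_PK", "products")
    val pairDf = Seq((1, Seq(Seq(101, 1), Seq(102, 2))))
      .toDF("product_PK", "products")

    ranksFromStructs(structDf).show()
    ranksFromPairs(pairDf).show()

    spark.stop()
  }
}
```

Both helpers keep `product_PK` in the result, so the transformed DataFrame can be joined on that column afterwards.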
    