PySpark: How to use a row value from one column to access another column that has the same name as that value

猫巷女王i 2020-12-17 05:12

I have a PySpark df:

+---+---+---+---+---+---+---+---+
| id| a1| b1| c1| d1| e1| f1|ref|
+---+---+---+---+---+---+---+---+
|  0|  1| 23|  4|  8|  9|  5| b1|
         


        
2 answers
  •  不知归路
    2020-12-17 06:10

    The OP asked for a Python solution. I'm answering with the Spark Scala (2.x) equivalent for reference; I hope it helps somebody.

    scala> val df = Seq((0, 1, 23, 4, 8, 9, 5, "b1"), (1, 2, 43, 8, 10, 20, 43, "e1"), (2,  3, 15,  0,  1, 23,  7, "b1"),(3,  4,  2,  6, 11,  5,  8, "d1"),(4,  5,  6,  7,  2,  8,  1, "f1")).toDF("id", "a1", "b1", "c1", "d1", "e1", "f1", "ref")
    df: org.apache.spark.sql.DataFrame = [id: int, a1: int ... 6 more fields]
    
    scala> df.show(false)
    +---+---+---+---+---+---+---+---+
    |id |a1 |b1 |c1 |d1 |e1 |f1 |ref|
    +---+---+---+---+---+---+---+---+
    |0  |1  |23 |4  |8  |9  |5  |b1 |
    |1  |2  |43 |8  |10 |20 |43 |e1 |
    |2  |3  |15 |0  |1  |23 |7  |b1 |
    |3  |4  |2  |6  |11 |5  |8  |d1 |
    |4  |5  |6  |7  |2  |8  |1  |f1 |
    +---+---+---+---+---+---+---+---+
    
    
    scala> val colx = df.columns.filter(x=>x!="ref").filter(x=>x!="id")
    colx: Array[String] = Array(a1, b1, c1, d1, e1, f1)
    
    scala> val colm = colx.map( x=> when(col("ref")===lit(x),col(x)) )
    colm: Array[org.apache.spark.sql.Column] = Array(CASE WHEN (ref = a1) THEN a1 END, CASE WHEN (ref = b1) THEN b1 END, CASE WHEN (ref = c1) THEN c1 END, CASE WHEN (ref = d1) THEN d1 END, CASE WHEN (ref = e1) THEN e1 END, CASE WHEN (ref = f1) THEN f1 END)
    
    scala> df.select(col("*"),concat_ws("",array(colm:_*)).as("res1")).show(false)
    +---+---+---+---+---+---+---+---+----+
    |id |a1 |b1 |c1 |d1 |e1 |f1 |ref|res1|
    +---+---+---+---+---+---+---+---+----+
    |0  |1  |23 |4  |8  |9  |5  |b1 |23  |
    |1  |2  |43 |8  |10 |20 |43 |e1 |20  |
    |2  |3  |15 |0  |1  |23 |7  |b1 |15  |
    |3  |4  |2  |6  |11 |5  |8  |d1 |11  |
    |4  |5  |6  |7  |2  |8  |1  |f1 |1   |
    +---+---+---+---+---+---+---+---+----+
    
    
    scala>
    
