I have a Spark DataFrame
df
with five columns. I want to add another column with its values being the tuple of the first and second columns. When u
You can use struct function which creates a tuple of provided columns:
import org.apache.spark.sql.functions.struct
val df = Seq((1,2), (3,4), (5,3)).toDF("a", "b")
df.withColumn("NewColumn", struct(df("a"), df("b")).show(false)
+---+---+---------+
|a |b |NewColumn|
+---+---+---------+
|1 |2 |[1,2] |
|3 |4 |[3,4] |
|5 |3 |[5,3] |
+---+---+---------+