I have the following DataFrame:
January | February | March
-----------------------------
10 | 10 | 10
20 | 20 | 20
50 | 50
This code is in Python, but it can be easily translated:
# First we create an RDD in order to create a DataFrame
# (sc is the SparkContext, predefined in the pyspark shell):
rdd = sc.parallelize([(10, 10, 10), (20, 20, 20)])
df = rdd.toDF(['January', 'February', 'March'])
df.show()
# Here we create a new column called 'TOTAL' that holds the sum
# of the columns df.January, df.February and df.March:
df.withColumn('TOTAL', df.January + df.February + df.March).show()
Output:
+-------+--------+-----+
|January|February|March|
+-------+--------+-----+
|     10|      10|   10|
|     20|      20|   20|
+-------+--------+-----+
+-------+--------+-----+-----+
|January|February|March|TOTAL|
+-------+--------+-----+-----+
|     10|      10|   10|   30|
|     20|      20|   20|   60|
+-------+--------+-----+-----+
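If there are many columns, typing them all out gets tedious. Here is a minimal sketch, assuming every column in the list is numeric; sum_columns is a hypothetical helper name of mine, not part of Spark. It folds the columns together with + using functools.reduce, which gives the same expression as writing the sum by hand:

```python
from functools import reduce
from operator import add


def sum_columns(columns):
    # Hypothetical helper: folds a non-empty list with "+", so for Spark
    # Column objects it builds the expression col1 + col2 + ... + colN.
    return reduce(add, columns)


# Usage with the DataFrame above (needs a running Spark session):
# df.withColumn('TOTAL', sum_columns([df[c] for c in df.columns])).show()
```

Since this is plain column addition, a NULL in any of the columns makes the total NULL for that row.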
You can also create a User Defined Function if you want; here is a link to the Spark documentation: UserDefinedFunction (udf)
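A rough sketch of what the UDF route could look like, under a few assumptions of mine: the columns hold integers (so the result is declared as LongType), and a missing value should count as 0 rather than turn the whole total into NULL. The names row_total and with_total are hypothetical. For a simple sum the built-in + on columns is usually the better choice, since a Python UDF is slower.

```python
def row_total(*values):
    # Plain Python: sum the values of one row, treating None (SQL NULL) as 0.
    # Counting NULL as 0 is an assumption, not what "+" on columns does.
    return sum(v for v in values if v is not None)


def with_total(df, columns, name='TOTAL'):
    # Imported here so row_total can be tried without a Spark installation.
    from pyspark.sql.functions import udf
    from pyspark.sql.types import LongType

    # Wrap the Python function as a UDF that returns a long integer.
    total_udf = udf(row_total, LongType())
    return df.withColumn(name, total_udf(*[df[c] for c in columns]))


# Usage with the DataFrame above (needs a running Spark session):
# with_total(df, ['January', 'February', 'March']).show()
```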