Add column sum as new column in PySpark dataframe

粉色の甜心 2020-12-02 22:43

I'm using PySpark and I have a Spark dataframe with a bunch of numeric columns. I want to add a column that is the sum of all the other columns.

Suppose my datafram

8 Answers
  •  庸人自扰
    2020-12-02 23:18

    The most straightforward way to do this is to use the expr function:

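    # Import expr by name; a wildcard import would shadow builtins such as sum and max.
    from pyspark.sql.functions import expr
    # The SQL expression is evaluated per row, adding the four named columns.
    data = data.withColumn('total', expr("col1 + col2 + col3 + col4"))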
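
The expr call above spells out four column names by hand. When there are many columns, the same expression string could be built from a list of names instead. This is only a minimal sketch: `build_sum_expr` and `add_total` are hypothetical helper names (not part of PySpark), and it assumes the column names contain no backticks.

```python
def build_sum_expr(columns):
    """Return a Spark SQL expression string that adds the given columns."""
    if not columns:
        raise ValueError("need at least one column to sum")
    # Backticks keep names containing spaces or dots valid in Spark SQL.
    # Assumption: no column name itself contains a backtick.
    return " + ".join(f"`{name}`" for name in columns)


def add_total(data, columns=None, total_name="total"):
    """Add a column holding the row-wise sum of `columns` (default: all columns)."""
    # Imported here so build_sum_expr can be used without Spark installed.
    from pyspark.sql.functions import expr

    columns = list(columns) if columns is not None else data.columns
    return data.withColumn(total_name, expr(build_sum_expr(columns)))
```

Calling `add_total(data)` without a column list sums every column in `data.columns`, so it assumes all of them are numeric. As with the hard-coded version, a null in any summed column makes the total null for that row.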
    
