Add column sum as new column in PySpark dataframe

粉色の甜心 2020-12-02 22:43

I'm using PySpark and I have a Spark dataframe with a bunch of numeric columns. I want to add a column that is the sum of all the other columns.

Suppose my datafram

8 Answers
  •  庸人自扰
    2020-12-02 23:11

    A very simple approach would be to use select instead of withColumn, as below:

    from pyspark.sql.functions import col
    df = df.select('*', (col("a") + col("b") + col("c")).alias("total"))

    This should give you the required sum; adjust the column names to match your dataframe.
