Add column sum as new column in PySpark dataframe

粉色の甜心 2020-12-02 22:43

I'm using PySpark and I have a Spark dataframe with a bunch of numeric columns. I want to add a column that is the sum of all the other columns.

Suppose my datafram

8 Answers
  •  伪装坚强ぢ
    2020-12-02 23:11

    The following approach works for me:

    1. Import the PySpark SQL functions module:
      from pyspark.sql import functions as F
    2. Pass F.expr a SQL expression string that adds the columns together:
      data_frame.withColumn('Total_Sum', F.expr('col_name1+col_name2+..col_namen'))
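
When there are many columns, typing the expression by hand gets tedious. Here is a minimal sketch of one way to build the string from a list of column names. The helper names `build_sum_expr` and `add_total` are hypothetical, made up for illustration; only `F.expr` and `withColumn` come from PySpark.

```python
def build_sum_expr(columns):
    """Return a Spark SQL expression string that adds the given columns."""
    if not columns:
        raise ValueError("need at least one column to sum")
    # Backticks quote each name so names containing spaces or dots still parse;
    # a literal backtick inside a name is escaped by doubling it.
    return " + ".join("`{}`".format(c.replace("`", "``")) for c in columns)


def add_total(data_frame, columns, total_name="Total_Sum"):
    """Append a column holding the row-wise sum of `columns` (hypothetical helper)."""
    # Imported here so build_sum_expr can be used without PySpark installed.
    from pyspark.sql import functions as F

    return data_frame.withColumn(total_name, F.expr(build_sum_expr(columns)))
```

To sum every column of a dataframe, something like `add_total(data_frame, data_frame.columns)` should work, assuming all of them are numeric. Note that a null in any of the summed columns makes the total null for that row, so you may want to fill nulls first.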
