I am writing a user-defined function (UDF) that takes all the columns of a dataframe except the first one and sums them (or applies some other operation). Now the dataframe can sometimes
Use struct instead of array:
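    from pyspark.sql.types import IntegerType
    from pyspark.sql.functions import udf, struct

    # The struct reaches the UDF as a Row, so its fields can be indexed by position.
    sum_cols = udf(lambda x: x[0] + x[1], IntegerType())

    # `spark` is an existing SparkSession.
    a = spark.createDataFrame([(101, 1, 16)], ['ID', 'A', 'B'])
    a.show()

    # Pack columns A and B into one struct column and pass it to the UDF.
    a.withColumn('Result', sum_cols(struct('A', 'B'))).show()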
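The lambda above is hard-wired to two columns. Since the question asks for every column except the first, here is a rough sketch of how the same struct approach might be generalized. The names `sum_fields` and `with_result` are made up for illustration, and it assumes the value columns are all integers; nulls are skipped, which may or may not be what you want.

```python
def sum_fields(row):
    """Sum the non-null fields of a struct.

    Hypothetical helper: Spark hands a struct to a Python UDF as a Row,
    which is a tuple subclass, so it can be iterated like any tuple.
    """
    return sum(v for v in row if v is not None)


def with_result(df, result_col="Result"):
    """Append a column with the sum of every column except the first.

    Hypothetical helper; assumes all columns after the first are integers.
    """
    # Imported here so sum_fields can be tested without a Spark installation.
    from pyspark.sql.functions import struct, udf
    from pyspark.sql.types import IntegerType

    sum_cols = udf(sum_fields, IntegerType())
    value_cols = df.columns[1:]  # skip the first column (e.g. 'ID')
    # struct() accepts any number of column names, so unpack the list.
    return df.withColumn(result_col, sum_cols(struct(*value_cols)))
```

With the dataframe `a` from above, `with_result(a).show()` should add a `Result` column containing 17, and the same call keeps working if more value columns are added later.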