I am working with a PySpark DataFrame with n columns. I have a set of m columns (m < n), and my task is to compute, for each row, the maximum value across those m columns.
For example: given columns c1, c2, and c3, I want a new column holding the row-wise maximum of the three.
You can also use the pyspark built-in greatest for the row-wise maximum (least is its counterpart for the row-wise minimum):

from pyspark.sql.functions import greatest, col
df = df.withColumn('max', greatest(col('c1'), col('c2'), col('c3')))
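For completeness, here is a minimal, self-contained sketch (assuming a local SparkSession and hypothetical sample columns c1, c2, c3) that shows greatest and least side by side, plus how to apply them to an arbitrary list of m column names:

from pyspark.sql import SparkSession
from pyspark.sql.functions import greatest, least, col

spark = SparkSession.builder.master('local[*]').appName('rowwise-max').getOrCreate()

# Hypothetical sample data with three numeric columns.
df = spark.createDataFrame([(1, 7, 4), (9, 2, 5), (3, 3, 8)], ['c1', 'c2', 'c3'])

# greatest/least take two or more columns and work row-wise;
# they skip nulls and return null only if every input is null.
result = (df
          .withColumn('max', greatest(col('c1'), col('c2'), col('c3')))
          .withColumn('min', least(col('c1'), col('c2'), col('c3'))))
result.show()
# Expected: (1, 7, 4) -> max=7, min=1; (9, 2, 5) -> max=9, min=2; (3, 3, 8) -> max=8, min=3

# For an arbitrary set of m column names, unpack the list
# (greatest requires at least two columns):
m_cols = ['c1', 'c2', 'c3']  # hypothetical list of column names
df = df.withColumn('max', greatest(*[col(c) for c in m_cols]))

Note that this differs from an aggregation like F.max, which operates down a column; greatest and least compare values across columns within each row.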