I'm trying to figure out the best way to get the largest value in a Spark dataframe column.
Consider the following example:
df = spark.createDataFrame([(1., 4.), (2., 5.), (3., 6.)], ["A", "B"])
I used another solution (by @satprem rath) already present in this thread.
To find the min value of age in the dataframe:
from pyspark.sql.functions import min

df.agg(min("age")).show()
+--------+
|min(age)|
+--------+
| 29|
+--------+
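The same agg pattern works for the largest value; a minimal sketch, assuming the same dataframe with an age column:

from pyspark.sql.functions import max

df.agg(max("age")).show()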
Edit: to add more context.
While the above method printed the result, I ran into issues when trying to assign the result to a variable for later reuse.
Hence, to assign just the plain Python value to a variable:
from pyspark.sql.functions import max, min
maxValueA = df.agg(max("A")).collect()[0][0]
maxValueB = df.agg(max("B")).collect()[0][0]
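For completeness, here is a small self-contained sketch (assuming the A/B example dataframe above) showing a couple of equivalent ways to pull the max out as a plain value; first() and the dict form of agg are standard DataFrame methods. Importing pyspark.sql.functions under an alias also avoids shadowing Python's built-in max/min:

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame([(1., 4.), (2., 5.), (3., 6.)], ["A", "B"])

# agg + collect: returns a list of Rows, so [0][0] is the first column of the first row
max_a = df.agg(F.max("A")).collect()[0][0]

# agg + first: first() returns a single Row, [0] indexes its only column
max_b = df.agg(F.max("B")).first()[0]

# dict form of agg; the resulting column is named "max(A)"
max_a_again = df.agg({"A": "max"}).collect()[0][0]

print(max_a, max_b, max_a_again)  # 3.0 6.0 3.0 with the example data above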