I'm trying to figure out the best way to get the largest value in a Spark dataframe column.
Consider the following example:
df = spark.createDataFrame([(1.0, 4.0), (2.0, 5.0), (3.0, 6.0)], ["A", "B"])
I believe the best solution is to use head().
Considering your example:
+---+---+
| A| B|
+---+---+
|1.0|4.0|
|2.0|5.0|
|3.0|6.0|
+---+---+
Using agg together with PySpark's max function, we can get the value as follows:
from pyspark.sql.functions import max
df.agg(max(df.A)).head()[0]
This will return:
3.0
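As an aside, if you would rather skip the import entirely, agg also accepts a dictionary mapping a column name to an aggregate function name. A minimal sketch of that equivalent form:

# Dict form of agg; no pyspark.sql.functions import needed
df.agg({"A": "max"}).head()[0]
# also returns 3.0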
Make sure you have the correct import:
from pyspark.sql.functions import max
The max function we use here is the PySpark SQL library function, not Python's built-in max function.
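To avoid shadowing the built-in max altogether, a common pattern is to import the functions module under an alias instead. A minimal sketch:

from pyspark.sql import functions as F

# F.max is the Spark SQL aggregate; Python's built-in max stays untouched
df.agg(F.max(df.A)).head()[0]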