Best way to get the max value in a Spark dataframe column

Asked by 一整个雨季 on 2020-12-07 10:27

I'm trying to figure out the best way to get the largest value in a Spark dataframe column.

Consider the following example:

df = spark.createDataFrame([(1., 4.), (2., 5.), (3., 6.)], ["A", "B"])
df.show()

Which creates:

+---+---+
|  A|  B|
+---+---+
|1.0|4.0|
|2.0|5.0|
|3.0|6.0|
+---+---+
        
13 Answers
  •  悲哀的现实 · 2020-12-07 10:49

    I used another solution (by @satprem rath) already posted in this thread.

    To find the min value of age in the dataframe:

    from pyspark.sql.functions import min

    df.agg(min("age")).show()
    
    +--------+
    |min(age)|
    +--------+
    |      29|
    +--------+
    

    Edit: to add more context.

    While the above method prints the result, I ran into issues when assigning the result to a variable for later reuse.

    Hence, to get just the scalar value assigned to a Python variable:

    from pyspark.sql.functions import max, min

    # agg() returns a single-row DataFrame; collect()[0][0] extracts the scalar value
    maxValueA = df.agg(max("A")).collect()[0][0]
    maxValueB = df.agg(max("B")).collect()[0][0]
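
    One caveat worth noting: importing max and min from pyspark.sql.functions shadows Python's built-in max() and min(). A common pattern to avoid that (a variant sketch, not part of the original answer) is to import the functions module under an alias instead:

    import pyspark.sql.functions as F

    # Same result, without shadowing the builtins;
    # .first() is equivalent to .collect()[0] here.
    maxValueA = df.agg(F.max("A")).first()[0]
    maxValueB = df.agg(F.max("B")).first()[0]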
    
