Best way to get the max value in a Spark dataframe column

一整个雨季 2020-12-07 10:27

I'm trying to figure out the best way to get the largest value in a Spark dataframe column.

Consider the following example:

    df = spark.createDataFrame([(1., 4.), (2., 5.), (3., 6.)], ["A", "B"])

13 Answers
  •  -上瘾入骨i
    2020-12-07 10:35

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions._
    
    val spark = SparkSession.builder().getOrCreate()
    import spark.implicits._  // required for toDF and the tuple encoder below
    
    val testDataFrame = Seq(
      (1.0, 4.0), (2.0, 5.0), (3.0, 6.0)
    ).toDF("A", "B")
    
    // Compute both column maxima in a single pass and read the
    // result back as a strongly typed (Double, Double) tuple
    val (maxA, maxB) = testDataFrame.select(max("A"), max("B"))
      .as[(Double, Double)]
      .first()
    println(maxA, maxB)
    

    The result is (3.0,6.0), the same values produced by testDataFrame.agg(max($"A"), max($"B")).collect()(0). The difference is that the agg/collect variant returns a Row, which prints as [3.0,6.0], rather than a typed tuple.
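
    Since the question itself uses PySpark, here is a minimal Python sketch of the same single-pass aggregation (assuming the df from the question and an active Spark session):

        from pyspark.sql import functions as F

        # agg(...).first() returns a Row; pull the maxima out by position
        row = df.agg(F.max("A"), F.max("B")).first()
        max_a, max_b = row[0], row[1]
        print(max_a, max_b)  # 3.0 6.0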
