What is the difference between sort and orderBy functions in Spark

Submitted by 别来无恙 on 2019-12-20 18:03:47

Question


What is the difference between sort and orderBy on a Spark DataFrame?

scala> zips.printSchema
root
 |-- _id: string (nullable = true)
 |-- city: string (nullable = true)
 |-- loc: array (nullable = true)
 |    |-- element: double (containsNull = true)
 |-- pop: long (nullable = true)
 |-- state: string (nullable = true)

The commands below produce the same result:

zips.sort(desc("pop")).show
zips.orderBy(desc("pop")).show

Answer 1:


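orderBy is just an alias for the sort function: it forwards its arguments to sort unchanged, so the two calls build the same query.

From the Scaladoc for Dataset.orderBy in the Spark source: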

  /**
   * Returns a new Dataset sorted by the given expressions.
   * This is an alias of the `sort` function.
   *
   * @group typedrel
   * @since 2.0.0
   */
  @scala.annotation.varargs
  def orderBy(sortCol: String, sortCols: String*): Dataset[T] = sort(sortCol, sortCols : _*)
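
One way to check the alias claim yourself is to run both calls against the same DataFrame and compare the results. The sketch below is only an illustration: it assumes a local SparkSession, and the object name SortVsOrderBy, the run helper, and the three sample rows are made up here (they are not from the original zips dataset, and the loc column is left out).

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.desc

object SortVsOrderBy {
  // Returns true when sort and orderBy yield the same rows in the same order.
  def run(): Boolean = {
    val spark = SparkSession.builder()
      .master("local[*]") // assumption: local mode, for illustration only
      .appName("sort-vs-orderBy")
      .getOrCreate()
    import spark.implicits._

    // Made-up rows standing in for the question's `zips` data.
    // The `pop` values are distinct so the descending order is unambiguous.
    val zips = Seq(
      ("z1", "Alpha", 100L, "AA"),
      ("z2", "Beta", 300L, "BB"),
      ("z3", "Gamma", 200L, "CC")
    ).toDF("_id", "city", "pop", "state")

    val sorted  = zips.sort(desc("pop"))
    val ordered = zips.orderBy(desc("pop"))

    // Print both physical plans so they can be compared by eye.
    sorted.explain()
    ordered.explain()

    // Compare the collected rows element by element, in order.
    val same = sorted.collect().sameElements(ordered.collect())
    spark.stop()
    same
  }

  def main(args: Array[String]): Unit =
    println(s"sort and orderBy agree: ${run()}")
}
```

Comparing collected rows is a deliberately simple check. On a real dataset with ties in the sort column, row order among the tied rows is not guaranteed, so inspecting the output of explain() is the more reliable comparison there.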


Source: https://stackoverflow.com/questions/40603202/what-is-the-difference-between-sort-and-orderby-functions-in-spark
