Pyspark dataframe orderBy list of columns [duplicate]

Posted by 狂风中的少年 on 2019-12-11 12:05:00

Question


I am trying to use the orderBy function on a PySpark DataFrame before writing it to CSV, but I am not sure how to call orderBy when I have a list of columns.

Code:

cols = ['col1', 'col2', 'col3']
df = df.orderBy(cols, ascending=False)

Answer 1:


As per docstring / signature:

Signature: df.orderBy(*cols, **kwargs)
Docstring:
Returns a new :class:`DataFrame` sorted by the specified column(s).
:param cols: list of :class:`Column` or column names to sort by.
:param ascending: boolean or list of boolean (default True).

Both

df = spark.createDataFrame([(1, 2, 3)])
cols = ["_1", "_2", "_3"]

df.orderBy(cols, ascending=False)

and

df.orderBy(*cols, ascending=False)

are valid, as well as equivalents with list[pyspark.sql.Column].
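If it helps, here is a minimal, self-contained sketch (the column names and output path are placeholders, not from the original question) showing both ways of passing the column list, a per-column list of ascending flags, and the write to CSV that the question mentions:

from pyspark.sql import SparkSession

# Minimal sketch; column names and the output path are hypothetical.
spark = SparkSession.builder.master("local[*]").appName("orderby-list").getOrCreate()

df = spark.createDataFrame(
    [(1, "a", 10), (2, "b", 5), (3, "c", 7)],
    ["col1", "col2", "col3"],
)

cols = ["col1", "col2", "col3"]

# Pass the list directly ...
sorted_df = df.orderBy(cols, ascending=False)

# ... or unpack it; both calls are equivalent.
sorted_df = df.orderBy(*cols, ascending=False)

# ascending can also be a list of booleans, one per sort column.
mixed_df = df.orderBy(cols, ascending=[False, True, False])

# Write the ordered result to CSV (placeholder path).
sorted_df.write.csv("/tmp/sorted_output", header=True, mode="overwrite")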



Source: https://stackoverflow.com/questions/50783515/pyspark-dafaframe-orderby-list-of-columns
