PySpark multi groupby with different columns
Question

I have data like below:

year  name     percent   sex
1880  John     0.081541  boy
1881  William  0.080511  boy
1881  John     0.050057  boy

I need to group by different columns and count each one:

    df_year = df.groupby('year').count()
    df_name = df.groupby('name').count()
    df_sex = df.groupby('sex').count()

Then I have to create a Window to get the top-3 rows for each column:

    window = Window.partitionBy('year').orderBy(col("count").desc())
    top4_res = df_year.withColumn('topn', func.row_number().over(window)).\
        filter(col(