How to use different window specification per column values?

Submitted by 吃可爱长大的小学妹 on 2019-12-24 09:48:14

Question


This is my partitionBy specification, which I need to change based on a column value in the data frame.

val windowSpec = Window.partitionBy("col1", "col2", "col3").orderBy($"col5".desc)

If the value of one of the columns (col6) in the data frame is I, the specification above should apply.

But when the value of col6 is O, the following specification should apply instead:

val windowSpec = Window.partitionBy("col1", "col3").orderBy($"col5".desc)

How can I implement this with a Spark data frame?

In other words, each record should be checked for whether col6 is I or O, and the matching partitionBy specification should be applied.


Answer 1:


Given the requirement to select the final window specification based on the values of the col6 column, I'd filter first and then apply the final window aggregation.

scala> dataset.show
+----+----+----+----+----+
|col1|col2|col3|col5|col6|
+----+----+----+----+----+
|   0|   0|   0|   0|   I| // <-- triggers 3 columns to use
|   0|   0|   0|   0|   O| // <-- the aggregation should use just 2 columns
+----+----+----+----+----+

With the above dataset, I'd filter to check whether there is at least one I in col6 and choose the window specification accordingly.

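import org.apache.spark.sql.expressions.Window

// Three partition columns when col6 contains an I, two otherwise
val windowSpecForIs = Window.partitionBy("col1", "col2", "col3").orderBy($"col5".desc)
val windowSpecForOs = Window.partitionBy("col1", "col3").orderBy($"col5".desc)

// true when no row has col6 === "I"; take(1) avoids scanning more than needed
val noIs = dataset.filter($"col6" === "I").take(1).isEmpty
val windowSpec = if (noIs) windowSpecForOs else windowSpecForIs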
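
The check above picks a single window specification for the whole dataset. If the choice really has to be made per record, as the question describes, one possible approach is to evaluate both specifications and then keep the result that matches col6 on each row. The following is only a sketch under assumptions: row_number() stands in for whatever window function is actually needed, the column names rn, rn_is and rn_os are made up for illustration, and the SparkSession setup is only there to make the snippet self-contained (in spark-shell, getOrCreate returns the existing session).

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{row_number, when}

// Local session for illustration only
val spark = SparkSession.builder().master("local[*]").appName("per-row-window").getOrCreate()
import spark.implicits._

// Same shape as the dataset shown above
val dataset = Seq(
  (0, 0, 0, 0, "I"),
  (0, 0, 0, 0, "O")
).toDF("col1", "col2", "col3", "col5", "col6")

val windowSpecForIs = Window.partitionBy("col1", "col2", "col3").orderBy($"col5".desc)
val windowSpecForOs = Window.partitionBy("col1", "col3").orderBy($"col5".desc)

// Compute the (assumed) window function under both specifications,
// then pick per row depending on col6 and drop the helper columns.
val ranked = dataset
  .withColumn("rn_is", row_number().over(windowSpecForIs))
  .withColumn("rn_os", row_number().over(windowSpecForOs))
  .withColumn("rn", when($"col6" === "I", $"rn_is").otherwise($"rn_os"))
  .drop("rn_is", "rn_os")

ranked.show()
```

Note that each window here still spans all rows of its partition, regardless of their col6 value. If I rows and O rows should not influence each other, splitting the dataset with filter on col6, applying one specification to each part and combining the parts with union may be closer to what is wanted.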


Source: https://stackoverflow.com/questions/47387390/how-to-use-different-window-specification-per-column-values
