Question:
With Spark SQL's window functions, I need to partition by multiple columns to run my data queries, as follows:
val w = Window.partitionBy($"a").partitionBy($"b").rangeBetween(-100, 0)
I currently do not have a test environment (I'm working on setting this up), but as a quick question: is this currently supported as part of Spark SQL's window functions, or will this not work?
Answer 1:
This won't work. The second partitionBy will overwrite the first one. Both partition columns have to be specified in the same call:
val w = Window.partitionBy($"a", $"b").rangeBetween(-100, 0)
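A minimal, self-contained sketch of the corrected window in use (the `MultiColWindow` object, the sample data, and the `ts` column are illustrative assumptions, not from the question). Note one additional detail: `rangeBetween` requires an `orderBy` column in the window specification, which the snippet above omits:

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions._

object MultiColWindow {
  // Counts, for each row, how many rows in the same (a, b) partition
  // have a ts value within the preceding 100 units (inclusive of the current row).
  def windowedCounts(spark: SparkSession): DataFrame = {
    import spark.implicits._
    val df = Seq(("x", "p", 10L), ("x", "p", 50L), ("x", "q", 60L)).toDF("a", "b", "ts")
    // Both partition columns go in a single partitionBy call;
    // rangeBetween additionally requires an ordering column.
    val w = Window.partitionBy($"a", $"b").orderBy($"ts").rangeBetween(-100, 0)
    df.withColumn("cnt", count(lit(1)).over(w))
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[1]").appName("MultiColWindow").getOrCreate()
    windowedCounts(spark).show()
    spark.stop()
  }
}
```

Here rows ("x", "p", 10) and ("x", "p", 50) fall in the same partition and within 100 units of each other, so the second gets `cnt = 2`, while ("x", "q", 60) sits in its own (a, b) partition and gets `cnt = 1`.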
来源:https://stackoverflow.com/questions/37795488/partitioning-by-multiple-columns-in-spark-sql