SparkSQL: conditional sum using two columns
Question: I hope you can help me with this. I have a DataFrame as follows:

import org.apache.spark.sql.functions.to_date

val df = sc.parallelize(Seq(
  (1, "a", "2014-12-01", "2015-01-01", 100),
  (2, "a", "2014-12-01", "2015-01-02", 150),
  (3, "a", "2014-12-01", "2015-01-03", 120),
  (4, "b", "2015-12-15", "2015-01-01", 100)
)).toDF("id", "prodId", "dateIns", "dateTrans", "value")
  .withColumn("dateIns", to_date($"dateIns"))
  .withColumn("dateTrans", to_date($"dateTrans"))

I would like to group by prodId and aggregate value, summing it over ranges of dates.
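For a conditional sum keyed on the two date columns, one common pattern is sum(when(...)) inside agg. Below is a minimal sketch that assumes the "ranges of dates" are buckets of days elapsed between dateIns and dateTrans, computed with datediff; the 30/60-day cut-offs and the column names sum_le_30d / sum_31_60d are placeholders for whatever ranges you actually need:

import org.apache.spark.sql.functions.{datediff, sum, when}

// Bucket each row by the number of days between dateIns and dateTrans,
// then sum `value` per bucket within each prodId group.
// Rows outside a bucket contribute 0 to that bucket's sum.
val result = df
  .groupBy($"prodId")
  .agg(
    sum(when(datediff($"dateTrans", $"dateIns") <= 30, $"value")
      .otherwise(0)).as("sum_le_30d"),
    sum(when(datediff($"dateTrans", $"dateIns").between(31, 60), $"value")
      .otherwise(0)).as("sum_31_60d")
  )

result.show()

Each sum(when(...).otherwise(0)) produces one output column per range, so adding another range is just another aggregate expression.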