SparkSQL - Lag function?

Submitted by 浪子不回头ぞ on 2019-11-29 11:28:14
  1. The frame specification should start with the keyword ROWS, not ROW
  2. The frame specification requires either a lower bound value

    ROWS BETWEEN 1 PRECEDING AND CURRENT ROW
    

    or the UNBOUNDED keyword

    ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    
  3. The LAG function doesn't accept a frame at all, so a correct SQL query with LAG can look like this:

    SELECT tx.cc_num, tx.trans_date, tx.trans_time, tx.amt,
           LAG(tx.amt) OVER (
               PARTITION BY tx.cc_num ORDER BY tx.trans_date, tx.trans_time
           ) AS prev_amt
    FROM tx
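
To see what this query returns without a Spark cluster, here is a minimal sketch that runs the same SELECT against an in-memory SQLite database from Python's standard library. This is only an illustration of the LAG semantics, not Spark itself: it assumes SQLite 3.25 or newer (the first version with window functions), and the sample rows are hypothetical. Only the table and column names come from the query above.

```python
import sqlite3

# Hypothetical sample rows for the tx table; only the column names
# (cc_num, trans_date, trans_time, amt) come from the query above.
rows = [
    ("c1", "2016-01-01", "10:00:00", 10.0),
    ("c1", "2016-01-01", "12:30:00", 25.5),
    ("c1", "2016-01-02", "09:15:00", 7.25),
    ("c2", "2016-01-01", "08:00:00", 100.0),
    ("c2", "2016-01-03", "18:45:00", 40.0),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tx (cc_num TEXT, trans_date TEXT, trans_time TEXT, amt REAL)"
)
conn.executemany("INSERT INTO tx VALUES (?, ?, ?, ?)", rows)

# Same window definition as above: no ROWS frame, just PARTITION BY and ORDER BY.
# The outer ORDER BY is added only to make the printed order deterministic.
query = """
SELECT tx.cc_num, tx.trans_date, tx.trans_time, tx.amt,
       LAG(tx.amt) OVER (
           PARTITION BY tx.cc_num ORDER BY tx.trans_date, tx.trans_time
       ) AS prev_amt
FROM tx
ORDER BY tx.cc_num, tx.trans_date, tx.trans_time
"""
result = conn.execute(query).fetchall()

for row in result:
    print(row)
```

The first row of each cc_num partition has no predecessor, so its prev_amt is NULL (None in Python). Every other row carries the amt of the previous transaction on the same card.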
    

Edit:

Regarding SQL DSL usage:

  1. As the error message says:

    Note that, using window functions currently requires a HiveContext

    Be sure to initialize sqlContext using HiveContext, not SQLContext.

  2. windowSpec.rowsBetween(-1, 0) has no effect here; once again, a frame specification is not supported by the lag function.
