Issue generating missing dates in a Hive query

梦如初夏 2020-12-11 21:54

I have a requirement where I need to look back over the previous values of a column, up to 1000 rows, and get those previous 1000 dates for my next steps, but all those 1000 previous

1 answer
  • 2020-12-11 22:39

    You can generate the dates for the required range in a subquery (see the date_range subquery in the example below) and left join it with your table. For dates that have no record in your table, the value will be null, while the dates themselves are returned from the date_range subquery without gaps. Set the start_date and end_date parameters to the range you need:

    set hivevar:start_date=2016-04-23; --replace with your start_date
    set hivevar:end_date=current_date; --replace with your end_date (wrap a literal date in single quotes)
    
    set hive.exec.parallel=true;
    set hive.auto.convert.join=true; --this enables map-join
    set hive.mapjoin.smalltable.filesize=25000000; --size of table to fit in memory
    
    with date_range as 
    (--this query generates the date range, check its output
    select date_add ('${hivevar:start_date}',s.i) as dt 
      from ( select posexplode(split(space(datediff(${hivevar:end_date},'${hivevar:start_date}')),' ')) as (i,x) ) s
    ) 
    
    select d.dt as date,
           t.your_col --some value from your table on date
      from date_range d 
           left join table1 t on d.dt=t.date 
    order by d.dt --order by dates if necessary
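
    To check the behavior on a small scale, here is a minimal self-contained sketch of the same pattern with fixed literal dates instead of hivevar parameters. The sample CTE and its two rows are hypothetical stand-ins for table1, and the cast to string is only a precaution, since date_add returns a string before Hive 2.1.0 and a date from 2.1.0 on.

    ```sql
    with date_range as
    (--three consecutive dates, 2016-04-23 through 2016-04-25 inclusive
    select cast(date_add('2016-04-23', s.i) as string) as dt
      from ( select posexplode(split(space(datediff('2016-04-25', '2016-04-23')), ' ')) as (i, x) ) s
    ),
    sample as
    (--hypothetical data standing in for table1, with 2016-04-24 missing
    select '2016-04-23' as dt, 10 as your_col
    union all
    select '2016-04-25' as dt, 30 as your_col
    )

    select d.dt,
           t.your_col --null where sample has no row for the date
      from date_range d
           left join sample t on d.dt = t.dt
    order by d.dt
    ```

    With these assumed rows, the query should return all three dates, with your_col null for 2016-04-24. The range is inclusive because space(n) yields n spaces, which split into n+1 empty strings, so posexplode produces positions 0 through n.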
    