I have a requirement where I need to go back to previous values for a column until 1000 rows and get those previous 1000 dates for my next steps, but all those 1000 previous
You can generate the dates for the required range in a subquery (see the date_range subquery in the example below) and left join it with your table. If your table has no record for some date, the value will be null, but the dates themselves are still returned from the date_range subquery without gaps. Set the start_date and end_date parameters to the range you need:
set hivevar:start_date=2016-04-23; --replace with your start_date
set hivevar:end_date=current_date; --replace with your end_date; it is substituted unquoted below, so quote a literal date, e.g. '2016-05-01'
set hive.exec.parallel=true;
set hive.auto.convert.join=true; --this enables map-join
set hive.mapjoin.smalltable.filesize=25000000; --size of table to fit in memory
with date_range as
(--this query generates the date range, check its output
 select date_add('${hivevar:start_date}',s.i) as dt
   from ( select posexplode(split(space(datediff(${hivevar:end_date},'${hivevar:start_date}')),' ')) as (i,x) ) s
)
select d.dt as date,
       t.your_col --some value from your table on date
  from date_range d
  left join table1 t on d.dt=t.date
 order by d.dt --order by dates if necessary
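
For the case in the question, where a fixed number of previous dates is needed rather than a start/end range, the same trick can be turned around. The following is only a sketch, assuming the 1000 dates are meant to be consecutive calendar days ending at current_date. table1, your_col and date are the placeholder names from the query above, and num_days is a hypothetical variable introduced here for illustration:

```sql
set hivevar:num_days=1000; --hypothetical parameter: how many dates to go back

with date_range as
(--space(n-1) returns n-1 spaces, split on ' ' turns that into n empty strings,
 --and posexplode numbers them 0..n-1, which gives one row per day offset
 select date_sub(current_date, s.i) as dt
   from ( select posexplode(split(space(${hivevar:num_days}-1),' ')) as (i,x) ) s
)
select d.dt as date,
       t.your_col --null when table1 has no row for that date
  from date_range d
  left join table1 t on d.dt=t.date
 order by d.dt desc; --most recent date first
```

Offset 0 is current_date itself, so this returns today plus the 999 days before it. If today should be excluded, use s.i+1 as the offset in date_sub.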