问题
I have a requirement where I need to go back to previous values for a column until 1000 rows and get those previous 1000 dates for my next steps , but all those 1000 previous dates are not present for that column in the table. But I need those missing dates to get from output of the query . When I try to run below query it is not displaying 1000 previous date values from current date
Example lets say only 2 dates are available for date column
date
2019-01-16
2019-01-19
I have come up with a query to get back 1000 dates but it is giving only nearest date as all previous back dates are missing
SELECT date FROM table1 t
WHERE
date >= date_sub(current_date,1000) and dt<current_date ORDER BY date LIMIT 1
If I run above query it is displaying 2019-01-16, since previous 1000 days back date are not present it is giving nearest date ,which is 2019-01-16 but I need missing dates starting from 2016-04-23 (1000th date from current date) till before current date (2019-01-18) as output of my query. Could you please help on this.
回答1:
You can generate dates for required range in the subquery (see date_range subquery in the example below) and left join it with your table. If there is no record in your table on some dates, the value will be null, dates will be returned from the date_range subquery without gaps. Set start_date and end_date parameters for date_range required:
set hivevar:start_date=2016-04-23; --replace with your start_date
set hivevar:end_date=current_date; --replace with your end_date
set hive.exec.parallel=true;
set hive.auto.convert.join=true; --this enables map-join
set hive.mapjoin.smalltable.filesize=25000000; --size of table to fit in memory
with date_range as
(--this query generates date range, check it's output
select date_add ('${hivevar:start_date}',s.i) as dt
from ( select posexplode(split(space(datediff(${hivevar:end_date},'${hivevar:start_date}')),' ')) as (i,x) ) s
)
select d.dt as date,
t.your_col --some value from your table on date
from date_range d
left join table1 t on d.dt=t.date
order by d.dt --order by dates if necessary
来源:https://stackoverflow.com/questions/54265149/facing-issue-in-hive-query-in-generating-missing-dates