Get the sysdate -1 in Hive

Posted by 妖精的绣舞 on 2019-12-07 05:06:02

Question


Is there a way to get the current date minus one day in Hive, that is, always yesterday's date, in the format 20120805?

Today is Aug 6th, so I can get the data for yesterday's date by running my query like this:

select * from table1 where dt = '20120805';

But I then tried the query below, which uses the date_sub function to compute yesterday's date, because the table is partitioned on the date (dt) column:

select * from table1 where dt = date_sub(TO_DATE(FROM_UNIXTIME(UNIX_TIMESTAMP(),
'yyyyMMdd')) , 1)     limit 10;

It looks for the data in all the partitions. Why? Am I doing something wrong in my query?

How can I make the evaluation happen in a subquery so that the whole table is not scanned?


Answer 1:


Try something like:

select * from table1 
where dt >= from_unixtime(unix_timestamp()-1*60*60*24, 'yyyyMMdd');

This works if you don't mind that Hive scans the entire table. unix_timestamp() is not deterministic, so the query planner in Hive cannot evaluate the expression ahead of time and prune partitions for you. In many cases (log files, for example), filtering on a non-deterministic expression instead of a constant partition key can start a very large Hadoop job, because it scans the whole table rather than just the rows with the given partition key.

If this matters to you, you can launch hive with an additional option

$ hive -hiveconf date_yesterday=20150331

And in the script or hive terminal use

select * from table1
where dt >= '${hiveconf:date_yesterday}';

Neither the name of the variable nor its value matters to Hive, so you can set the value with Unix commands to get the prior date. In the specific case of the OP:

$ hive -hiveconf date_yesterday=$(date --date yesterday "+%Y%m%d")
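The same idea can be wrapped in a small script. This is only a sketch under a few assumptions: `query.hql` is a hypothetical file holding the query that reads `${hiveconf:date_yesterday}`, and the `date -v-1d` fallback is there for BSD/macOS systems where GNU's `--date` option is not available. The script prints the hive command instead of running it, so the value can be checked first.

```shell
#!/bin/sh
# Compute yesterday's date as yyyyMMdd.
# Try GNU date first, then fall back to the BSD/macOS syntax.
yesterday_yyyymmdd() {
    date --date yesterday "+%Y%m%d" 2>/dev/null || date -v-1d "+%Y%m%d"
}

dt="$(yesterday_yyyymmdd)"

# Refuse to continue if the value is not exactly eight digits.
case "$dt" in
    [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]) ;;
    *) echo "unexpected date value: '$dt'" >&2; exit 1 ;;
esac

# query.hql is a hypothetical script name; remove the leading echo to run hive.
echo hive -hiveconf "date_yesterday=$dt" -f query.hql
```

Because the shell computes the date before Hive starts, the query sees a plain constant, which is what allows the planner to restrict the scan to one partition.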



Answer 2:


In MySQL:

select DATE_FORMAT(DATE_SUB(CURDATE(), INTERVAL 1 DAY),'%Y%m%d');

In SQL Server:

SELECT convert(varchar,getDate()-1,112)

In Hive, use this query:

SELECT FROM_UNIXTIME(UNIX_TIMESTAMP()-1*24*60*60,'yyyyMMdd');



Answer 3:


It looks like DATE_SUB assumes the date is in yyyy-MM-dd format, so you might have to do some more format manipulation to get to your format. Try this:

select * from table1 
where dt =  FROM_UNIXTIME(
                UNIX_TIMESTAMP(
                    DATE_SUB(
                        FROM_UNIXTIME(UNIX_TIMESTAMP(),'yyyy-MM-dd')
                    , 1)
                , 'yyyy-MM-dd')
            , 'yyyyMMdd')     limit 10;
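The mismatch between the two formats is easy to see outside Hive as well. A minimal shell sketch, where `to_compact_date` is a hypothetical helper name used only for illustration:

```shell
# Hypothetical helper: turn a yyyy-MM-dd date into the yyyyMMdd form stored in dt.
to_compact_date() {
    printf '%s\n' "$1" | tr -d '-'
}

to_compact_date "2012-08-05"   # prints 20120805
```

The nested Hive expression above does the equivalent conversion inside the query: it builds the yyyy-MM-dd form that DATE_SUB understands, subtracts a day, and then reformats the result as yyyyMMdd.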



Answer 4:


Use this:

select * from table1 where dt = date_format(concat(year(date_sub(current_timestamp,1)),'-', month(date_sub(current_timestamp,1)), '-', day(date_sub(current_timestamp,1))), 'yyyyMMdd') limit 10;

This gives a deterministic result (a string) that matches your partition value.

I know it's super verbose.



Source: https://stackoverflow.com/questions/11833701/get-the-sysdate-1-in-hive
