Looking to set a reusable variable in Hive

Submitted by 倖福魔咒の on 2020-04-18 05:33:52

Question


I want to set a variable called today_date, as shown below, and then reuse it throughout the query. The code below throws an error.

set today_date = date_format(date_sub(current_date, 1), 'YYYYMMdd')

select account
from table
where data_date = today_date

Answer 1:


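The first command should end with a semicolon, and the variable should be defined in the hivevar namespace so that the reference further down resolves:

set hivevar:today_date=date_format(date_sub(current_date, 1), 'yyyyMMdd');

The variable should then be used like this:

select account
from table
where data_date=${hivevar:today_date};

The set command does not evaluate the expression; the text is substituted as is. The resulting query will be

select account
from table
where data_date = date_format(date_sub(current_date, 1), 'yyyyMMdd');

Note that the pattern is written as 'yyyyMMdd' here. In Java date patterns, 'YYYY' is the week-based year, which can return the wrong year for dates around New Year.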

If you want the variable to hold an already calculated value, calculate it in the shell and pass it to your Hive script, as in this answer: https://stackoverflow.com/a/37821218/2700344
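As a rough illustration of the shell approach, the sketch below computes yesterday's date before Hive starts and passes it in with --hivevar. It assumes GNU date (for the -d option), and query.hql is a hypothetical script name that is not part of the original question. If Hive or the script is not found, the command is only printed.

```shell
#!/bin/sh
# Sketch: evaluate the date in the shell so Hive receives a finished literal.
# Assumptions: GNU date (-d option); query.hql is a hypothetical script file.
set -eu

# Yesterday's date as an 8-digit string (year, month, day).
today_date=$(date -d 'yesterday' +%Y%m%d)

if command -v hive >/dev/null 2>&1 && [ -f query.hql ]; then
    # The value lands in the hivevar namespace under the name today_date.
    hive --hivevar "today_date=${today_date}" -f query.hql
else
    # Dry run when Hive or the script is missing: show the command only.
    echo "hive --hivevar today_date=${today_date} -f query.hql"
fi
```

Inside query.hql the value would then be referenced as a quoted literal, for example where data_date = '${hivevar:today_date}', because what Hive substitutes is now a plain string rather than an expression.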




Answer 2:


You still need to put a semicolon at the end of the set line, surround your variable with ${} and use the proper namespace.

Note that this will not execute the date_format() function when the variable is defined. When you use the variable, the SQL code is simply copied as is. Think of it more as a macro than as a variable.

Furthermore, Hive has multiple variable namespaces. The two easiest options are either to be less verbose when you define the variable and more verbose when you use it (hiveconf namespace):

set today_date = date_format(date_sub(current_date, 1), 'yyyyMMdd');
select account from table where data_date = ${hiveconf:today_date};

or the other way round (hivevar namespace):

set hivevar:today_date = date_format(date_sub(current_date, 1), 'yyyyMMdd');
select account from table where data_date = ${today_date};
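If these statements are written to a script file from the shell, the ${...} references have to reach Hive untouched. The sketch below is one way to do that, using a quoted here-document delimiter so the shell does not try to expand them itself. The names conf_date, var_date and accounts are hypothetical and chosen only for illustration; the script just writes the file and prints it, and running it with hive -f is left to the reader.

```shell
#!/bin/sh
# Sketch: both namespace styles written to one hypothetical Hive script file.
# The quoted 'EOF' delimiter stops the shell from expanding the ${...} parts.
set -eu

hql_file=$(mktemp)

cat > "$hql_file" <<'EOF'
-- hiveconf namespace: plain set, prefix required when the variable is used
set conf_date = date_format(date_sub(current_date, 1), 'yyyyMMdd');
select account from accounts where data_date = ${hiveconf:conf_date};

-- hivevar namespace: prefix on set, bare name when the variable is used
set hivevar:var_date = date_format(date_sub(current_date, 1), 'yyyyMMdd');
select account from accounts where data_date = ${var_date};
EOF

# Print the generated script; it could then be run with: hive -f "$hql_file"
cat "$hql_file"
```

With an unquoted delimiter, the shell would treat ${hiveconf:conf_date} as its own parameter expansion and the reference would be lost before Hive ever sees it.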


Source: https://stackoverflow.com/questions/51486896/looking-to-set-a-reusable-variable-in-hive
