Automating Hive Activity using AWS

Anonymous (unverified), submitted 2019-12-03 07:36:14

Question:

I would like to run my Hive script automatically every day, and AWS Data Pipeline looks like the option for that. The problem is that I export data from DynamoDB to S3 and then manipulate that data with a Hive script. I specify the input and output inside the Hive script itself, and that is where the problem starts: a HiveActivity has to have an input and an output, but I have to define them in the script file.

I am looking for a way to automate this Hive script. Any ideas?

Cheers,

Answer 1:

You can disable staging on the HiveActivity to run any arbitrary Hive script.

stage = false 

Do something like:

{
  "name": "DefaultActivity1",
  "id": "ActivityId_1",
  "type": "HiveActivity",
  "stage": "false",
  "scriptUri": "s3://baucket/query.hql",
  "scriptVariable": [
    "param1=value1",
    "param2=value2"
  ],
  "schedule": {
    "ref": "ScheduleId_l"
  },
  "runsOn": {
    "ref": "EmrClusterId_1"
  }
},
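If you would rather register the pipeline from code than from the console, the same object has to be reshaped first. Below is a minimal sketch, assuming you submit it with boto3's put_pipeline_definition, which takes each object as an id, a name, and a list of key/stringValue or key/refValue fields. The helper name to_pipeline_object is hypothetical, and the bucket, schedule, and cluster ids are simply the placeholders from the answer above.

```python
import json


def to_pipeline_object(definition):
    """Reshape one JSON-style pipeline object into id/name/fields form.

    Hypothetical helper: every key other than "id" and "name" becomes a
    field entry, and list values are expanded into one entry per item.
    """
    fields = []
    for key, value in definition.items():
        if key in ("id", "name"):
            continue
        items = value if isinstance(value, list) else [value]
        for item in items:
            if isinstance(item, dict) and "ref" in item:
                # References to other pipeline objects use refValue.
                fields.append({"key": key, "refValue": item["ref"]})
            else:
                # Everything else is passed as a plain string.
                fields.append({"key": key, "stringValue": str(item)})
    return {"id": definition["id"], "name": definition["name"], "fields": fields}


# The HiveActivity from the answer, with staging disabled.
hive_activity = {
    "name": "DefaultActivity1",
    "id": "ActivityId_1",
    "type": "HiveActivity",
    "stage": "false",
    "scriptUri": "s3://baucket/query.hql",
    "scriptVariable": ["param1=value1", "param2=value2"],
    "schedule": {"ref": "ScheduleId_l"},
    "runsOn": {"ref": "EmrClusterId_1"},
}

pipeline_object = to_pipeline_object(hive_activity)
print(json.dumps(pipeline_object, indent=2))
```

The resulting dictionary would go into the pipelineObjects list alongside the schedule and EMR cluster objects it references. Those are not shown here because the answer does not define them.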

