Oozie

How to schedule a sqoop action using oozie

久未见 submitted on 2019-12-04 03:58:28
Question: I am new to Oozie. Just wondering - how do I schedule a Sqoop job using Oozie? I know a sqoop action can be added as part of an Oozie workflow, but how can I schedule a sqoop action so that it runs automatically, say every 2 minutes or at 8pm every day (just like a cron job)? Answer 1: You need to create a coordinator.xml file with a start, an end and a frequency. Here is an example <coordinator-app name="example-coord" xmlns="uri:oozie:coordinator:0.2" frequency="${coord:days(7)}" start="${start}" end= "${end}
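For reference, here is a minimal sketch of such a coordinator (the app name, the workflowAppUri variable and the timezone are illustrative assumptions, not taken from the original answer) that runs the workflow containing the sqoop action every 2 minutes via the coord EL functions:

<coordinator-app name="sqoop-coord" xmlns="uri:oozie:coordinator:0.2"
                 frequency="${coord:minutes(2)}"
                 start="${start}" end="${end}" timezone="UTC">
  <action>
    <workflow>
      <!-- HDFS directory that holds the workflow.xml with the sqoop action,
           defined in job.properties (illustrative name) -->
      <app-path>${workflowAppUri}</app-path>
    </workflow>
  </action>
</coordinator-app>

An 8pm daily run can be expressed the same way with frequency="${coord:days(1)}" and a start timestamp whose time-of-day is 20:00; newer Oozie releases also accept a cron expression directly in the frequency attribute.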

org.apache.hadoop.mapreduce.counters.LimitExceededException: Too many counters: 121 max=120

心已入冬 submitted on 2019-12-04 03:29:45
Question: I'm running a Hadoop job (from Oozie) that has a few counters and multiple outputs. I get an error like: org.apache.hadoop.mapreduce.counters.LimitExceededException: Too many counters: 121 max=120 Then I removed all the code that uses counters, set mout.setCountersEnabled to false, and also raised the max counters to 240 in the Hadoop config. Now I still get the same kind of error: org.apache.hadoop.mapreduce.counters.LimitExceededException: Too many counters: 241 max=240 How can I solve this problem? Is
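As background (an assumption about the usual cause, not a confirmed diagnosis from this thread): framework and MultipleOutputs counters count against the same ceiling as user counters, and the ceiling is read from the cluster-side configuration when counters are aggregated, so raising it only in the job's own configuration is often not enough. The property is mapreduce.job.counters.max on Hadoop 2.x (mapreduce.job.counters.limit on older 1.x clusters), set in mapred-site.xml and picked up after the affected daemons are restarted:

<!-- mapred-site.xml on the cluster; the value is illustrative -->
<property>
  <name>mapreduce.job.counters.max</name>
  <value>500</value>
</property>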

Oozie job scheduling

旧时模样 submitted on 2019-12-03 18:58:29
A while ago I wrote a MapReduce program and eventually needed to schedule it as a job, but I wasn't sure which approach was most suitable; in the end I settled on Oozie. I had only written web applications before, and then the bosses suddenly asked me to play with Hadoop, so I cheerfully took on the task; as a newcomer I ran into a great many pitfalls along the way... Environment: hadoop 1.2.1, sqoop 1.4.4, oozie 3.3.2. 1. For the Oozie installation, please see my earlier post: http://blog.csdn.net/jueshengtianya/article/details/25300761 which covers the pitfalls I hit before. 2. An Oozie workflow looks for the JARs it has to run in the lib directory that sits next to the workflow definition; every dependency JAR the workflow needs must live under that path. 3. My Oozie working directory: hadoop@steven:~/hadoop1.1.2/hadoop-1.2.1/iesRunShell/oozie/iesCron$ ../../../bin/hadoop fs -ls /ies/oozie/cron/ Found 4 items -rw-r--r-- 3 hadoop supergroup 1591 2014-05-12 19:37 /ies/oozie/cron/coordinator.xml -rw-r--r-- 3 hadoop

E0701 XML schema error in OOZIE workflow

Anonymous (unverified) submitted on 2019-12-03 08:59:04
Question: The following is my workflow.xml <workflow-app xmlns="uri:oozie:workflow:0.3" name="import-job"> <start to="createtimelinetable" /> <action name="createtimelinetable"> <sqoop xmlns="uri:oozie:sqoop-action:0.3"> <job-tracker>${jobTracker}</job-tracker> <name-node>${nameNode}</name-node> <configuration> <property> <name>mapred.compress.map.output</name> <value>true</value> </property> </configuration> <command>import --connect jdbc:mysql://10.65.220.75:3306/automation --table ABC --username root</command> </sqoop> <ok to="end"/> <error to="end
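One hedged thing to check for E0701 is that the action's namespace matches a schema version the Oozie server has actually registered (the list is controlled by oozie.service.SchemaService.wf.ext.schemas in oozie-site.xml); a minimal sqoop action written against the widely available 0.2 schema would look like:

<action name="createtimelinetable">
  <sqoop xmlns="uri:oozie:sqoop-action:0.2">
    <job-tracker>${jobTracker}</job-tracker>
    <name-node>${nameNode}</name-node>
    <command>import --connect jdbc:mysql://10.65.220.75:3306/automation --table ABC --username root</command>
  </sqoop>
  <ok to="end"/>
  <error to="end"/>
</action>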

IOException: Filesystem closed exception when running oozie workflow

旧城冷巷雨未停 submitted on 2019-12-03 07:50:36
Question: We are running a workflow in Oozie. It contains two actions: the first is a MapReduce job that generates files in HDFS, and the second is a job that should copy the data from those files into a database. Both parts complete successfully, but Oozie throws an exception at the end that marks the run as failed. This is the exception: 2014-05-20 17:29:32,242 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:lpinsight (auth:SIMPLE) cause:java.io.IOException:
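A common cause of "Filesystem closed" in Oozie launchers (stated here as an assumption, not the confirmed root cause of this workflow) is user code calling close() on a FileSystem instance obtained from FileSystem.get(), which is shared through Hadoop's filesystem cache, so the launcher's later HDFS calls fail. Besides simply not closing the shared instance, a hedged workaround is to disable the cache for the action:

<!-- inside the action's <configuration> element -->
<property>
  <name>fs.hdfs.impl.disable.cache</name>
  <value>true</value>
</property>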

Oozie workflow: Hive table not found but it does exist

倖福魔咒の submitted on 2019-12-03 07:40:20
I have an Oozie workflow running on a CDH4 cluster of 4 machines (one master-for-everything, three "dumb" workers). The Hive metastore runs on the master using MySQL (the driver is present), and the Oozie server also runs on the master, likewise backed by MySQL. Using the web interface I can import into and query Hive as expected, but when I run the same queries from within an Oozie workflow they fail. Even adding "IF EXISTS" leads to the error below. I tried adding the connection information as properties to the Hive job, without any success. Can anybody give me a hint? Did I miss anything? Any further
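A hedged guess for this kind of failure: the Hive action launched on a worker never sees the hive-site.xml that points at the MySQL metastore on the master, so it talks to an empty local metastore where the table genuinely does not exist. Shipping the metastore configuration with the action usually addresses that; the sketch below assumes hive-site.xml has been copied into the workflow application directory on HDFS and uses a placeholder script name:

<action name="hive-query">
  <hive xmlns="uri:oozie:hive-action:0.2">
    <job-tracker>${jobTracker}</job-tracker>
    <name-node>${nameNode}</name-node>
    <!-- hive-site.xml sits next to workflow.xml in the HDFS app directory -->
    <job-xml>hive-site.xml</job-xml>
    <script>query.q</script>
  </hive>
  <ok to="end"/>
  <error to="end"/>
</action>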

Oozie SSH Action

北城以北 submitted on 2019-12-03 06:56:10
Oozie SSH Action issue: We are trying to run a few commands on a particular host machine in our cluster, and we chose the SSH action for this. We have been facing this SSH issue for some time now. What might be the real issue here? Please point me towards the solution. Logs: AUTH_FAILED: Not able to perform operation [ssh -o PasswordAuthentication=no -o KbdInteractiveDevices=no -o StrictHostKeyChecking=no -o ConnectTimeout=20 USER@1.2.3.4 mkdir -p oozie-oozi/0000000-131008185935754-oozie-oozi-W/action1--ssh/ ] | ErrorStream: Warning: Permanently added host,1.2.3.4 (RSA) to the list of known
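For context (general background rather than the resolution from this thread): the ssh command shown in the log is issued by the Unix account that runs the Oozie server, so AUTH_FAILED usually means that account lacks passwordless, key-based access to USER@1.2.3.4. The action itself is declared roughly as follows (action name and command are placeholders):

<action name="remote-step">
  <ssh xmlns="uri:oozie:ssh-action:0.1">
    <host>USER@1.2.3.4</host>
    <command>mkdir</command>
    <args>-p</args>
    <args>/tmp/oozie-demo</args>
  </ssh>
  <ok to="end"/>
  <error to="end"/>
</action>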

Oozie: Launch Map-Reduce from Oozie <java> action?

Anonymous (unverified) submitted on 2019-12-03 03:06:01
Question: I am trying to execute a Map-Reduce task in an Oozie workflow using a <java> action. O'Reilly's Apache Oozie (Islam and Srinivasan 2015) notes that: While it's not recommended, Java action can be used to run Hadoop MapReduce jobs because MapReduce jobs are nothing but Java programs after all. The main class invoked can be a Hadoop MapReduce driver and can call Hadoop APIs to run a MapReduce job. In that mode, Hadoop spawns more mappers and reducers as required and runs them on the cluster. However, I'm not having success using this approach. The
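Where the driver-inside-<java> approach proves fragile, the same job can usually be expressed with the dedicated <map-reduce> action instead; the sketch below uses placeholder class names and old-API properties, as an illustration rather than the questioner's actual code:

<action name="mr-step">
  <map-reduce>
    <job-tracker>${jobTracker}</job-tracker>
    <name-node>${nameNode}</name-node>
    <configuration>
      <property>
        <name>mapred.mapper.class</name>
        <value>com.example.MyMapper</value>
      </property>
      <property>
        <name>mapred.reducer.class</name>
        <value>com.example.MyReducer</value>
      </property>
      <property>
        <name>mapred.input.dir</name>
        <value>${inputDir}</value>
      </property>
      <property>
        <name>mapred.output.dir</name>
        <value>${outputDir}</value>
      </property>
    </configuration>
  </map-reduce>
  <ok to="end"/>
  <error to="end"/>
</action>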

Problems with starting Oozie workflow

Anonymous (unverified) submitted on 2019-12-03 02:42:02
Question: I have a problem starting an Oozie workflow: Config: <workflow-app name="Hive" xmlns="uri:oozie:workflow:0.4"> <start to="Hive"/> <action name="Hive"> <hive xmlns="uri:oozie:hive-action:0.2"> <job-tracker>${jobTracker}</job-tracker> <name-node>${nameNode}</name-node> <configuration> <property> <name>oozie.hive.defaults</name> <value>hive-default.xml</value> </property> </configuration> <script>/user/hue/oozie/workspaces/hive/hive.sql</script> <param>INPUT_TABLE=movieapp_log_json</param> <param>OUTPUT=/user/hue/oozie/workspaces/output</param>
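One hedged observation: oozie.hive.defaults is a legacy mechanism and the hive-default.xml it names has to exist in the workflow application directory; an alternative sketch is to drop that property and point the script at the metastore directly (the thrift URI below is a placeholder, not taken from the question):

<hive xmlns="uri:oozie:hive-action:0.2">
  <job-tracker>${jobTracker}</job-tracker>
  <name-node>${nameNode}</name-node>
  <configuration>
    <property>
      <name>hive.metastore.uris</name>
      <value>thrift://metastore-host:9083</value>
    </property>
  </configuration>
  <script>/user/hue/oozie/workspaces/hive/hive.sql</script>
  <param>INPUT_TABLE=movieapp_log_json</param>
  <param>OUTPUT=/user/hue/oozie/workspaces/output</param>
</hive>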

Getting E0902: Exception occured: [User: oozie is not allowed to impersonate oozie]

Anonymous (unverified) submitted on 2019-12-03 01:59:02
Question: Hi, I am new to Oozie and I am getting this error E0902: Exception occured: [User: pramod is not allowed to impersonate pramod] when I run the following command: ./oozie job -oozie http://localhost:11000/oozie/ -config ~/Desktop/map-reduce/job.properties -run. My Hadoop version is 1.0.3, my Oozie version is 3.3.2, and it runs in pseudo-distributed mode. The following is the content of my core-site.xml: hadoop.tmp.dir = /home/pramod/hadoop-${user.name}, fs.default.name = hdfs://localhost:54310, hadoop.proxyuser.${user.name}.hosts = *, hadoop.proxyuser.${user.name}
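The usual shape of the fix (hedged, since the exact account depends on which user runs the Oozie server) is that property names in core-site.xml are not variable-expanded, so hadoop.proxyuser.${user.name}.hosts never becomes hadoop.proxyuser.pramod.hosts; the proxy-user entries need the literal user name, followed by a NameNode/JobTracker restart:

<!-- core-site.xml -->
<property>
  <name>hadoop.proxyuser.pramod.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.pramod.groups</name>
  <value>*</value>
</property>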