Oozie

Apache Oozie failed loading ShareLib

早过忘川 submitted on 2019-11-28 05:12:22
Question: I got the following in oozie.log: org.apache.oozie.service.ServiceException: E0104: Could not fully initialize service [org.apache.oozie.service.ShareLibService], Not able to cache sharelib. An Admin needs to install the sharelib with oozie-setup.sh and issue the 'oozie admin' CLI command to update the sharelib. I ran the following commands:

oozie-setup.sh sharelib create -fs hdfs://localhost:54310
oozied.sh start
hdfs dfs -ls /user/hduser/share/lib
15/02/24 18:05:03 WARN util.NativeCodeLoader:
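
The usual fix, sketched below, is to recreate the sharelib against the correct NameNode URI and then tell the running Oozie server to pick it up. The HDFS URI and default Oozie URL here are taken from the question; adjust both for your setup:

    # recreate the sharelib on HDFS (run as the user that owns the Oozie service)
    oozie-setup.sh sharelib create -fs hdfs://localhost:54310

    # after oozied.sh start, tell the running server to reload it, then verify
    oozie admin -oozie http://localhost:11000/oozie -sharelibupdate
    oozie admin -oozie http://localhost:11000/oozie -shareliblist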

Hadoop集群配置Hue

家住魔仙堡 submitted on 2019-11-28 04:34:00
Hue is a lightweight web server that lets you use Hadoop directly from your browser. Hue is just a "view on top of any Hadoop distribution" and can be installed on any machine. There are several ways to install Hue (for example, the "Download" section of gethue.com); see the official documentation. The next step is to configure Hue to point at your Hadoop cluster. By default, Hue assumes a local cluster exists (i.e. a single machine). To interact with a real cluster, Hue needs to know which hosts the Hadoop services run on.

Where is hue.ini (the configuration file)? Hue's main configuration lives in the hue.ini file. It lists many options, but in essence it is the addresses and ports of HDFS, YARN, Oozie, Hive... Depending on how you installed Hue, the ini file is located at:

CDH package: /etc/hue/conf/hue.ini
tarball release: /usr/share/desktop/conf/hue.ini
development version: desktop/conf/pseudo-distributed.ini
Cloudera Manager: CM generates the whole hue.ini for you, so there is nothing to do: /var/run/cloudera-scm-agent/process/`ls -alrt /var/run/cloudera-scm-agent/process | grep HUE | tail -1 | awk '
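
As a minimal sketch (hostnames and ports below are placeholders, not taken from the article), the cluster-pointing entries in hue.ini look like this:

    [hadoop]
      [[hdfs_clusters]]
        [[[default]]]
          # must match fs.defaultFS in core-site.xml
          fs_defaultfs=hdfs://namenode-host:8020
          webhdfs_url=http://namenode-host:50070/webhdfs/v1
      [[yarn_clusters]]
        [[[default]]]
          resourcemanager_host=resourcemanager-host
          resourcemanager_api_url=http://resourcemanager-host:8088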

Override hadoop's mapreduce.fileoutputcommitter.marksuccessfuljobs in oozie

时光怂恿深爱的人放手 submitted on 2019-11-28 04:12:48
Question:

<property> <name>mapreduce.fileoutputcommitter.marksuccessfuljobs</name> <value>false</value> </property>

I want to override the above property to true. The property needs to stay false for the rest of the jobs on the cluster, but in my oozie workflow I need hadoop to create a _SUCCESS file in the output directory after the job completes. It's a hive action in the workflow that writes the output. Please help.

Answer 1: Hive unfortunately overrides this capability by setting its own
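
Since the truncated answer indicates Hive overrides the committer setting, putting the property inside the Hive action's <configuration> block may simply not take effect. One possible workaround (a sketch of my own, not from the original answer; the output path and transition targets are placeholders) is a follow-up fs action that creates the marker file explicitly:

    <action name="mark-success">
        <fs>
            <touchz path="${nameNode}/path/to/job/output/_SUCCESS"/>
        </fs>
        <ok to="end"/>
        <error to="fail"/>
    </action>

Chaining this fs action after the Hive action produces the _SUCCESS marker without touching the cluster-wide committer configuration.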

Job queue for Hive action in oozie

丶灬走出姿态 submitted on 2019-11-28 02:21:33
I have an oozie workflow. I am submitting all the hive actions with <name>mapred.job.queue.name</name> <value>${queueName}</value>. But for a few hive actions, the launched job is not in the specified queue; it is invoked in the default queue. Please suggest the cause behind this behavior and a solution.

Samson Scharfrichter: A. Oozie specifics. Oozie propagates the "regular" Hadoop properties to a "regular" MapReduce action. But for other types of action (Shell, Hive, Java, etc.), where Oozie runs a single Mapper task in YARN, it does not consider it a real MapReduce job. Hence it uses a different
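
The truncated answer is pointing at Oozie's launcher job: properties meant for the launcher must carry the oozie.launcher. prefix. A sketch of an action configuration that routes both the launcher and the jobs Hive spawns to the intended queue (${queueName} comes from the question; the prefixed property name is standard Oozie behavior):

    <configuration>
        <!-- queue for the single-mapper launcher job Oozie starts for the Hive action -->
        <property>
            <name>oozie.launcher.mapred.job.queue.name</name>
            <value>${queueName}</value>
        </property>
        <!-- queue for the MapReduce jobs Hive itself spawns -->
        <property>
            <name>mapred.job.queue.name</name>
            <value>${queueName}</value>
        </property>
    </configuration>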

Error on running multiple Workflow in OOZIE-4.1.0

谁说我不能喝 submitted on 2019-11-28 00:06:48
I installed oozie 4.1.0 on a Linux machine by following the steps at http://gauravkohli.com/2014/08/26/apache-oozie-installation-on-hadoop-2-4-1/ (hadoop version 2.6.0, maven 3.0.4, pig 0.12.0). Cluster setup: MASTER NODE running Namenode, Resourcemanager, proxyserver; SLAVE NODE running Datanode, Nodemanager. When I run a single workflow job, it succeeds. But when I try to run more than one workflow job, both jobs get stuck in the ACCEPTED state. Inspecting the error log, I narrowed the problem down to:

2014-12-24 21:00:36,758 [JobControl] INFO org.apache.hadoop.ipc.Client - Retrying connect
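
One common cause of this symptom on a small cluster (offered here as a hedged guess, since the excerpt is truncated): the Oozie launcher ApplicationMasters consume the entire share of cluster memory YARN allows for AMs, so the child jobs can never leave ACCEPTED. Assuming the CapacityScheduler, the cap to check lives in capacity-scheduler.xml:

    <property>
        <name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
        <!-- default is 0.1; raising it lets more AMs (launchers plus child jobs) run concurrently -->
        <value>0.5</value>
    </property>

It is also worth verifying that yarn.nodemanager.resource.memory-mb on the slave is large enough to hold a launcher and its launched job at the same time.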

Oozie shell action memory limit

和自甴很熟 submitted on 2019-11-27 22:35:05
We have an oozie workflow with a shell action that needs more memory than YARN gives a map task by default. How can we give it more memory? We have tried adding the following configuration to the action:

<configuration> <property> <name>mapreduce.map.memory.mb</name> <value>6144</value> <!-- for example --> </property> </configuration>

We have set this both inline (in the workflow.xml) and as a jobXml. Neither has had any effect. We found the answer: a shell action is executed as an oozie "launcher" map task, and this task does not use the normal configuration
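
The truncated answer ends at the key point: the launcher task reads launcher-prefixed properties. A sketch of the action configuration that does take effect (values are examples, mirroring the 6144 MB from the question):

    <configuration>
        <!-- memory for the launcher map task that runs the shell command -->
        <property>
            <name>oozie.launcher.mapreduce.map.memory.mb</name>
            <value>6144</value>
        </property>
        <property>
            <name>oozie.launcher.mapreduce.map.java.opts</name>
            <value>-Xmx5120m</value>
        </property>
    </configuration>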

Oozie + Sqoop: JDBC Driver Jar Location

佐手、 submitted on 2019-11-27 18:33:25
Question: I have a 6-node cloudera-based hadoop cluster and I'm trying to connect to an oracle database from a sqoop action in oozie. I have copied my ojdbc6.jar into the sqoop lib location (which for me happens to be at /opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/lib/sqoop/lib/) on all the nodes and have verified that I can run a simple 'sqoop eval' from all 6 nodes. Now when I run the same command using Oozie's sqoop action, I get "Could not load db driver class: oracle.jdbc.OracleDriver"
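
A jar on the local sqoop lib path is invisible to the Oozie action, which resolves its classpath from HDFS. A sketch of the two usual fixes (HDFS paths are placeholders; on newer Oozie versions the sharelib lives under a timestamped lib_* subdirectory, so check with oozie admin -shareliblist first):

    # option 1: put the driver next to the workflow, in its lib/ directory
    hdfs dfs -put ojdbc6.jar /user/hduser/my-workflow/lib/

    # option 2: add it to the sqoop sharelib and refresh the server's view of it
    hdfs dfs -put ojdbc6.jar /user/oozie/share/lib/sqoop/
    oozie admin -oozie http://oozie-host:11000/oozie -sharelibupdate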

sqoop1 export and Hue/Oozie pitfalls

扶醉桌前 submitted on 2019-11-27 17:19:42
Possibly a version difference; following other users' advice, the command was finally changed to:

export --connect jdbc:mysql://172.16.5.100:3306/dw_test --username testuser --password ****** --table che100kv --export-dir /user/hive/warehouse/che100kv0/000000_0 --input-fields-terminated-by \001 -m 1

It failed with:

Error during export: Export job failed! at org.apache.sqoop.mapreduce.ExportJobBase.runExport(ExportJobBase.java:439) at org.apache.sqoop.manager.SqlManager.exportTable(SqlManager.java:931) at org.apache.sqoop.tool.ExportTool.exportTable(ExportTool.java: <<< Invocation of Sqoop command completed <<< Hadoop Job IDs executed by Sqoop: job_1534936991079_0934 Intercepting
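
One hedged guess at this failure mode: passed unquoted from a shell, \001 can be mangled before Sqoop ever sees it, and a delimiter mismatch makes every row unparseable, failing the export. A sketch with the delimiter quoted (connection details copied from above; the password stays elided as in the original):

    sqoop export \
      --connect jdbc:mysql://172.16.5.100:3306/dw_test \
      --username testuser --password '******' \
      --table che100kv \
      --export-dir /user/hive/warehouse/che100kv0/000000_0 \
      --input-fields-terminated-by '\001' \
      -m 1

Checking the failed map task's log for the first rejected record (job_1534936991079_0934 above) usually confirms whether the delimiter or a column-count mismatch is the culprit.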

launching a spark program using oozie workflow

て烟熏妆下的殇ゞ submitted on 2019-11-27 16:47:57
Question: I am working with a scala program using spark packages. Currently I run the program using a bash command from the gateway: /homes/spark/bin/spark-submit --master yarn-cluster --class "com.xxx.yyy.zzz" --driver-java-options "-Dyyy.num=5" a.jar arg1 arg2. I would like to start using oozie to run this job. I have a few open questions: Where should I put the spark-submit executable? On HDFS? How do I define the spark action? Where should the --driver-java-options appear? How should the oozie
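
A sketch of the equivalent Spark action (schema uri:oozie:spark-action:0.1). The application jar is uploaded to an HDFS path of your choosing, which answers the "where to put it" question; spark-submit itself is not needed, because the action invokes Spark for you. The HDFS jar path and transition targets below are placeholders:

    <action name="spark-node">
        <spark xmlns="uri:oozie:spark-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <master>yarn-cluster</master>
            <name>my-spark-job</name>
            <class>com.xxx.yyy.zzz</class>
            <jar>${nameNode}/user/spark/apps/a.jar</jar>
            <spark-opts>--driver-java-options -Dyyy.num=5</spark-opts>
            <arg>arg1</arg>
            <arg>arg2</arg>
        </spark>
        <ok to="end"/>
        <error to="fail"/>
    </action>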

Getting E0902: Exception occured: [User: oozie is not allowed to impersonate oozie]

微笑、不失礼 submitted on 2019-11-27 15:18:38
Question: Hi, I am new to Oozie and I am getting the error E0902: Exception occured: [User: pramod is not allowed to impersonate pramod] when I run the following command: ./oozie job -oozie http://localhost:11000/oozie/ -config ~/Desktop/map-reduce/job.properties -run. My hadoop version is 1.0.3, my oozie version is 3.3.2, and I am running in pseudo-distributed mode. The following is the content of my core-site.xml:

<configuration> <property> <name>hadoop.tmp.dir</name> <value>/home/pramod/hadoop-${user.name}</value> <
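
The standard fix is to whitelist the user that runs the Oozie server as a Hadoop proxy user. A sketch for this question's setup (user pramod, per the error message; restart Hadoop after editing):

    <!-- add to core-site.xml -->
    <property>
        <name>hadoop.proxyuser.pramod.hosts</name>
        <value>*</value>
    </property>
    <property>
        <name>hadoop.proxyuser.pramod.groups</name>
        <value>*</value>
    </property>

The wildcard values are permissive and fine for a pseudo-distributed sandbox; lock them down to specific hosts and groups on a real cluster.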