Oozie

MapReduce job status is stuck in RUNNING state

微笑、不失礼 posted on 2019-12-11 05:33:15
Question: I'm trying to run a MapReduce program from Oozie (4.1.0), but its status stays in the RUNNING state and never changes. workflow.xml:

<workflow-app xmlns="uri:oozie:workflow:0.4" name="simple-Workflow">
    <start to="RunMapreduceJob" />
    <action name="RunMapreduceJob">
        <map-reduce>
            <job-tracker>localhost:8088</job-tracker>
            <name-node>hdfs://localhost:9000</name-node>
            <prepare>
                <delete path="hdfs://localhost:9000/dataoutput"/>
            </prepare>
            <configuration>
                <property>
                    <name>mapred.job.queue.name</name>
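One thing worth checking in excerpts like this: port 8088 is the YARN ResourceManager's web UI, whereas Oozie's <job-tracker> element expects the ResourceManager's RPC address (8032 by default on Hadoop 2). A minimal corrected sketch, assuming a default single-node YARN configuration:

<action name="RunMapreduceJob">
    <map-reduce>
        <!-- ResourceManager RPC address (yarn.resourcemanager.address), not the 8088 web UI port -->
        <job-tracker>localhost:8032</job-tracker>
        <name-node>hdfs://localhost:9000</name-node>
        ...
    </map-reduce>
    <ok to="end"/>
    <error to="fail"/>
</action>

Pointing <job-tracker> at the web UI port typically leaves the launcher unable to submit the child job, which matches a workflow stuck in RUNNING.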

Multiple input paths configuration in Oozie

杀马特。学长 韩版系。学妹 posted on 2019-12-11 05:15:32
Question: I am trying to configure a MapReduce job in Oozie. This job has two different input formats and two input data folders. I followed the post "How to configure oozie workflow for multi-input path with multiple mappers" and added these properties to my workflow.xml:

<property>
    <name>mapred.input.dir.formats</name>
    <value>folder/data/*;org.apache.hadoop.mapred.SequenceFileInputFormat\,data/*;org.apache.hadoop.mapred.TextInputFormat</value>
</property>
<property>
    <name>mapred.input.dir.mappers</name>
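For context, mapred.input.dir.formats and mapred.input.dir.mappers are the properties that the old-API org.apache.hadoop.mapred.lib.MultipleInputs helper writes into the job configuration; on their own they do nothing unless the job's input format and mapper are the delegating classes that read them. A hedged sketch of the companion properties that usually have to accompany them in the same <configuration> block:

<property>
    <name>mapred.input.format.class</name>
    <value>org.apache.hadoop.mapred.lib.DelegatingInputFormat</value>
</property>
<property>
    <name>mapred.mapper.class</name>
    <value>org.apache.hadoop.mapred.lib.DelegatingMapper</value>
</property>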

Sqoop export fails through Oozie

痞子三分冷 posted on 2019-12-11 03:59:06
Question: I am trying to export data from HDFS to MySQL through Sqoop. I can run Sqoop from the shell and it works fine, but when I invoke it through Oozie it raises the following error and fails. I have also included the jars. There is no descriptive log.

Sqoop script:

export --connect jdbc:mysql://localhost/bigdata --username root --password cloudera --verbose --table AGGREGATED_METRICS --input-fields-terminated-by '\0001' --export-dir /bigdata/aggregated_metrics

error: Launcher
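When a Sqoop command works from the shell but fails under Oozie, the usual suspects are a missing MySQL JDBC driver on the launcher classpath (mysql-connector-java.jar in the workflow's lib/ directory or the Sqoop sharelib) and quoting: Oozie's <command> element splits on whitespace without shell-style quoting, so a quoted argument like '\0001' is passed literally, quotes included. A sketch of the action using <arg> elements instead, assuming ${jobTracker} and ${nameNode} come from job.properties:

<action name="sqoop-export">
    <sqoop xmlns="uri:oozie:sqoop-action:0.2">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <arg>export</arg>
        <arg>--connect</arg>
        <arg>jdbc:mysql://localhost/bigdata</arg>
        <arg>--username</arg>
        <arg>root</arg>
        <arg>--password</arg>
        <arg>cloudera</arg>
        <arg>--verbose</arg>
        <arg>--table</arg>
        <arg>AGGREGATED_METRICS</arg>
        <arg>--input-fields-terminated-by</arg>
        <arg>\0001</arg>
        <arg>--export-dir</arg>
        <arg>/bigdata/aggregated_metrics</arg>
    </sqoop>
    <ok to="end"/>
    <error to="fail"/>
</action>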

SparkAction for yarn-cluster

二次信任 posted on 2019-12-11 03:48:23
Question: Using the Hortonworks HDP 2.3 preview sandbox (Oozie 4.2.0.2.3.0.0-2130, Spark 1.3, and Hadoop 2.7.1.2.3.0.0-2130), I am trying to invoke the Oozie Spark action with "yarn-cluster" as the master. The example provided in the Oozie Spark Action documentation runs the Spark action with a "local" master. The same page also says that to run on YARN, the Spark assembly jar must be available to the Spark action. I have two questions. How do we make the Spark assembly jar available to the Spark action?
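A common arrangement is to put the Spark assembly jar either in the workflow's lib/ directory on HDFS or in the Oozie sharelib's spark directory, then point the action at yarn-cluster. A hedged sketch of such an action under the spark-action 0.1 schema shipped with Oozie 4.2 (the class and jar names here are placeholders, not from the question):

<action name="spark-node">
    <spark xmlns="uri:oozie:spark-action:0.1">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <master>yarn-cluster</master>
        <name>Spark-on-YARN example</name>
        <class>com.example.MySparkApp</class>
        <jar>${nameNode}/user/${wf:user()}/apps/spark/my-spark-app.jar</jar>
    </spark>
    <ok to="end"/>
    <error to="fail"/>
</action>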

Oozie error while trying to execute “bin/mkdistro.sh -DskipTests”

♀尐吖头ヾ posted on 2019-12-11 02:58:56
Question: Trying to install Oozie 4.0.1 following http://www.thecloudavenue.com/2013/10/installation-and-configuration-of.html. Hadoop version: 2.4.0; Maven: 3.0.4; Sqoop: 1.4.4. While trying to execute "bin/mkdistro.sh -DskipTests", the build failed:

..........
[INFO] Apache Oozie HCatalog Libs ........................ SUCCESS [0.399s]
[INFO] Apache Oozie Core ................................. FAILURE [7.819s]
[INFO] Apache Oozie Docs ................................. SKIPPED
.........
[ERROR] Failed to

Oozie time zone handling for daylight saving (cron expressions)

冷暖自知 posted on 2019-12-11 02:09:16
Question: I have an Oozie application that is supposed to respect daylight saving time. My schedules are complex, so I can't represent them with Expression Language functions such as ${coord:days(2)}; I need to use cron expressions instead. One example schedule is every weekday at 13:30 (the cron expression for Oozie is "30 13 * * 2-6", with no time zone adjustment) in the America/Los_Angeles time zone. I want this schedule to work correctly regardless of DST changes. If I schedule this
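For reference, a cron-style frequency and the coordinator's timezone attribute go together in the coordinator definition; this is the setup whose DST behaviour the question is about. A minimal sketch under the coordinator 0.4 schema (start/end timestamps and the app path are placeholders):

<coordinator-app name="weekday-1330" frequency="30 13 * * 2-6"
                 start="2019-01-01T21:30Z" end="2020-01-01T21:30Z"
                 timezone="America/Los_Angeles"
                 xmlns="uri:oozie:coordinator:0.4">
    <action>
        <workflow>
            <app-path>${workflowAppPath}</app-path>
        </workflow>
    </action>
</coordinator-app>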

Capture console output of a Spark action node in Oozie as a variable across the Oozie workflow

ⅰ亾dé卋堺 posted on 2019-12-11 00:26:32
Question: Is there a way to capture the console output of a Spark job in Oozie? I want to use a specific printed value in the next action node after the Spark job. I was thinking I could use ${wf:actionData("action-id")["Variable"]}, but it seems that Oozie cannot capture output from a Spark action node. In a shell action, by contrast, you can just echo "var=12345" and then invoke wf:actionData to use the value as an Oozie variable across the
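Since the Spark action has no <capture-output/> element, a common workaround is to wrap spark-submit in a shell action, print the value as key=value, and capture it there. A hedged sketch (run_spark.sh is a hypothetical wrapper script that runs spark-submit and ends with something like echo "var=12345"):

<action name="spark-via-shell">
    <shell xmlns="uri:oozie:shell-action:0.2">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <exec>run_spark.sh</exec>
        <file>run_spark.sh</file>
        <capture-output/>
    </shell>
    <ok to="use-value"/>
    <error to="fail"/>
</action>

The next node can then read ${wf:actionData('spark-via-shell')['var']}. Note that captured output is size-limited (2 KB by default, via oozie.action.max.output.data), so only a small key=value payload should be printed.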

Example Oozie job works from Hue, but not from command line: SparkMain not found

﹥>﹥吖頭↗ posted on 2019-12-11 00:25:36
Question: I've successfully run the example Spark workflow ("Copy a file by launching a Spark Java program") provided in the Hue Oozie workflow editor (in the Cloudera 5.5.1 QuickStart VM). I'm now trying to run it manually using the oozie command-line tool:

oozie job -oozie http://localhost:11000/oozie -config job.properties -run

The workflow XML is basically unchanged. I have copied it to HDFS and have the following job.properties:

nameNode=hdfs://localhost:8020
jobTracker=localhost:8032
oozie.wf
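A "SparkMain not found" failure usually means the Oozie Spark sharelib never made it onto the launcher's classpath; Hue sets the system-libpath flag behind the scenes, while a hand-written job.properties has to do it explicitly. A sketch of the additions, assuming the Spark sharelib is installed on the cluster:

# make the Oozie sharelib available to the launcher
oozie.use.system.libpath=true
# optionally pin the action to the spark sharelib directory
oozie.action.sharelib.for.spark=spark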

Sqoop import as Avro data file gives all values as NULL when creating external Avro table in Hive

故事扮演 posted on 2019-12-10 23:29:17
Question: I am trying to import Oracle DB data into HDFS using a Sqoop free-form import query that joins two tables, with '--as-avrodatafile', using the Oozie scheduler. The following is the content of my workflow.xml:

<?xml version="1.0" encoding="UTF-8"?>
<workflow-app xmlns="uri:oozie:workflow:0.2" name="sqoop-freeform-wf">
    <start to="sqoop-freeform-node"/>
    <action name="sqoop-freeform-node">
        <sqoop xmlns="uri:oozie:sqoop-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
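When every column of an external Avro table reads back as NULL, a frequent cause is a mismatch between the table's declared column names and the field names in the Sqoop-generated Avro schema; pointing the table at the generated .avsc file avoids declaring the columns by hand. A hedged Hive DDL sketch (table name, location, and schema path are placeholders):

CREATE EXTERNAL TABLE aggregated_import
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
LOCATION '/user/hive/external/aggregated_import'
TBLPROPERTIES ('avro.schema.url'='hdfs:///user/hive/schemas/aggregated_import.avsc');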

Kerberos authentication error when submitting a job via the Oozie Java API

吃可爱长大的小学妹 posted on 2019-12-10 21:06:30
Question: I have a Hadoop 2.7 cluster with Oozie 4.0.1 running in secure mode (with Kerberos). Everything works: I can submit a job with the CLI as follows:

kinit myuser
oozie job -oozie https://10.1.130.10:21003/oozie -config job.properties -run

But when I submit a job through the Oozie Java API, a Kerberos exception occurs:

Exception in thread "main" AUTHENTICATION : Could not authenticate, GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
at org.apache.oozie.client.AuthOozieClient
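When the CLI authenticates but the Java API does not, the JVM usually is not picking up the Kerberos ticket cache that kinit created. A minimal sketch using AuthOozieClient (the URL is from the question; the application path is a placeholder, and the JVM must be able to resolve krb5.conf and the ticket cache):

import java.util.Properties;
import org.apache.oozie.client.AuthOozieClient;
import org.apache.oozie.client.OozieClient;

public class SubmitJob {
    public static void main(String[] args) throws Exception {
        // Let JGSS fall back to the default ticket cache instead of requiring a JAAS Subject
        System.setProperty("javax.security.auth.useSubjectCredsOnly", "false");

        AuthOozieClient client = new AuthOozieClient("https://10.1.130.10:21003/oozie", "KERBEROS");
        Properties conf = client.createConfiguration();
        conf.setProperty(OozieClient.APP_PATH, "hdfs://namenode:9000/user/myuser/app"); // placeholder
        String jobId = client.run(conf);
        System.out.println("Workflow job submitted: " + jobId);
    }
}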