Oozie

Piping data into jobs in Hadoop MR/Pig

我是研究僧i submitted on 2020-01-07 04:31:25
Question: I have three different types of jobs running on data in HDFS. In the current setup the three jobs have to be run separately. We now want to run them together, piping the output of one job into the next without writing the intermediate data to HDFS, to improve the architecture and overall performance. Any suggestions for this scenario are welcome. PS: Oozie does not fit the workflow, and the Cascading framework is ruled out because of scalability issues. Thanks

Answer 1: Hadoop
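If the steps can be expressed in Pig Latin, one option is to fold all three jobs into a single Pig script: relations flow from one operator to the next, Pig plans the hand-offs itself, and you never materialize your own intermediate datasets in HDFS between steps. A minimal sketch under that assumption (the paths, field positions, and per-step logic are all hypothetical):

    cat > combined.pig <<'EOF'
    -- Each relation stands in for one of the three former jobs; Pig fuses
    -- compatible operators into as few MapReduce jobs as it can.
    raw   = LOAD '/data/input' USING PigStorage('\t');
    step1 = FILTER raw BY $0 IS NOT NULL;                  -- former job 1
    step2 = GROUP step1 BY $1;                             -- former job 2
    step3 = FOREACH step2 GENERATE group, COUNT(step1);    -- former job 3
    STORE step3 INTO '/data/output';
    EOF
    pig -f combined.pig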

service specific users not created in cloudera

守給你的承諾、 submitted on 2020-01-07 02:02:12
Question: I did not face any problems while installing Cloudera, but I just realized that I should have users like oozie and hdfs created on my CentOS machine, I guess under the /home directory? I do not have any such users under /home, and I am not able to log in as the oozie user with su oozie. Is it an installation problem, or is there some other way to do it? Now that I am trying to copy a jar into the oozie sharelib folder, it is not allowed through the root user, and I do not see any
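Packaged installs normally create oozie, hdfs, and the other service accounts as system users: they get no directory under /home and a nologin shell, which is why a plain su oozie fails. A sketch of how to verify this and act as a service user anyway (the jar name and the sharelib timestamp directory are placeholders):

    # System accounts usually live outside /home with /sbin/nologin as shell
    getent passwd oozie hdfs

    # Run a single command as the service user instead of logging in as it
    sudo -u oozie hadoop fs -put myjar.jar /user/oozie/share/lib/lib_<timestamp>/oozie/

    # Or force a shell despite the nologin entry
    su -s /bin/bash oozie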

Make Oozie not change the CLASSPATH of a java action

筅森魡賤 submitted on 2020-01-05 03:34:32
Question: I'm running a Java application in Oozie, and Oozie is adding something to the classpath. How do I know? When I run this application without Oozie it works perfectly fine, but with Oozie I get:

    java.lang.NoSuchMethodError: org.apache.hadoop.yarn.webapp.util.WebAppUtils.getProxyHostsAndPortsForAmFilter(Lorg/apache/hadoop/conf/Configuration;)Ljava/util/List;
        at org.apache.hadoop.yarn.server.webproxy.amfilter.AmFilterInitializer.initFilter(AmFilterInitializer.java:40)
        at org.apache.hadoop.http.HttpServer.
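A NoSuchMethodError of this kind usually means the launcher put an older Hadoop jar ahead of the one the application was built against. One setting worth trying (a sketch, not verified against this cluster; the main class is a placeholder) is to ask the launcher job to prefer the user's jars:

    <java>
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <configuration>
            <property>
                <!-- oozie.launcher.* properties are applied to the launcher job -->
                <name>oozie.launcher.mapreduce.job.user.classpath.first</name>
                <value>true</value>
            </property>
        </configuration>
        <main-class>com.example.Main</main-class>
    </java>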

Oozie - Hadoop commands are not executing (Shell)

巧了我就是萌 submitted on 2020-01-04 05:30:55
Question: I am running a shell script that contains hadoop commands. Executing it fails with:

    Main class [org.apache.oozie.action.hadoop.ShellMain], exit code [1]

I am running a simple shell script with Cloudera Hue - Oozie. However, when the script has no hadoop commands, it executes successfully. I have set oozie.use.system.libpath=true and can see my libs in user/oozie/share/lib/<lib_timestamp>. Below is the shell script I am trying to run:

    #! /bin/bash
    $(hadoop fs -mkdir
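Two things are worth checking here. First, $( ... ) command substitution runs the command and then tries to execute its output as a further command, which can by itself produce a non-zero exit. Second, shell actions run on an arbitrary node as the YARN container user, which may lack HDFS permissions. A sketch of a more defensive version (the user and target path are hypothetical):

    #!/bin/bash
    set -euo pipefail              # fail fast so Oozie surfaces the real error

    # Optional workaround: make HDFS operations run as the workflow user, not 'yarn'
    export HADOOP_USER_NAME=my_user

    # Call the command directly instead of wrapping it in $( ... )
    hadoop fs -mkdir -p /user/my_user/output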

Oozie shell script action

吃可爱长大的小学妹 submitted on 2020-01-03 15:29:24
Question: I am exploring the capabilities of Oozie for managing Hadoop workflows. I am trying to set up a shell action which invokes some Hive commands. My shell script hive.sh looks like:

    #!/bin/bash
    hive -f hivescript

where the Hive script (which has been tested independently) creates some tables and so on. My question is where to keep the hivescript and then how to reference it from the shell script. I've tried two ways: first using a local path, like hive -f /local/path/to/file , and using a
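The usual pattern is to keep both files in the workflow's application directory on HDFS and list each in a <file> element; Oozie then copies them into the action's working directory, so the script can reference hivescript by its bare name. A sketch of such an action (${appPath} stands in for the workflow application directory):

    <action name="hive_shell">
        <shell xmlns="uri:oozie:shell-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <exec>hive.sh</exec>
            <!-- Both files are localized into the container's working directory -->
            <file>${appPath}/hive.sh#hive.sh</file>
            <file>${appPath}/hivescript#hivescript</file>
        </shell>
        <ok to="end"/>
        <error to="kill"/>
    </action>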

Imported Failed: Cannot convert SQL type 2005 when importing CLOB data from an Oracle database

筅森魡賤 submitted on 2020-01-02 23:13:15
Question: I am trying to import an Oracle table's data with a CLOB data type using Sqoop, and it fails with the error Imported Failed: Cannot convert SQL type 2005. I am running Sqoop version 1.4.5-cdh5.4.7. Please help me import the CLOB data type. I am using the below Oozie workflow to import the data:

    <workflow-app xmlns="uri:oozie:workflow:0.4" name="EBIH_Dly_tldb_dly_load_wf">
        <credentials>
            <credential name="hive2_cred" type="hive2">
                <property>
                    <name>hive2.jdbc.url</name>
                    <value>$
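SQL type 2005 is java.sql.Types.CLOB, which Sqoop will not map on its own; the usual fix is to map the column to a Java String explicitly with --map-column-java. A command-line sketch (connection string, table, and column names are placeholders; the same arguments would go into the workflow's sqoop action):

    sqoop import \
      --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
      --username scott -P \
      --table MY_TABLE \
      --map-column-java CLOB_COLUMN=String \
      --target-dir /user/my_user/my_table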

Workflow error logs disabled in Oozie 4.2

半城伤御伤魂 submitted on 2020-01-02 03:41:37
Question: I am using Oozie 4.2, which comes bundled with HDP 2.3. While working with a few example workflows that come with the Oozie package, I noticed that the job error log is disabled, which makes debugging really difficult in the event of a failure. I tried running the commands below:

    # oozie job -config /home/santhosh/examples/apps/hive/job.properties -run
    job: 0000063-150904123805993-oozie-oozi-W
    # oozie job -errorlog 0000063-150904123805993-oozie-oozi-W
    Error Log is disabled!!

Can someone
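The -errorlog subcommand only works when the server's oozie-log4j.properties defines the dedicated oozieError appender, which some distribution builds leave out. Until the server configuration is fixed, the full job log is still reachable and can be filtered client-side; a sketch using the job id from above (the server URL is a placeholder):

    oozie job -oozie http://oozie-host:11000/oozie \
      -log 0000063-150904123805993-oozie-oozi-W | grep -E 'WARN|ERROR|FATAL'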

Executing a sqoop job shell script in parallel in Oozie

喜你入骨 submitted on 2019-12-30 14:49:44
Question: I have a shell script which executes a sqoop job. The script is below:

    #!/bin/bash
    table=$1
    sqoop job --exec ${table}

When I pass the table name in the workflow, the sqoop job executes successfully. The workflow is below:

    <workflow-app name="Shell_script" xmlns="uri:oozie:workflow:0.5">
        <start to="shell"/>
        <kill name="Kill">
            <message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
        </kill>
        <action name="shell_script">
            <shell xmlns="uri:oozie:shell
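To run several of these sqoop jobs in parallel, the workflow can branch with a fork/join pair, each path invoking the same script with a different table argument. A sketch of the control-flow fragment (action bodies omitted; the names are placeholders):

    <start to="fork_tables"/>
    <fork name="fork_tables">
        <path start="shell_table1"/>
        <path start="shell_table2"/>
    </fork>
    <!-- shell_table1 and shell_table2 are shell actions like the one above,
         each passing its own table name in an <argument> element, and each
         transitioning to the join on success -->
    <join name="join_tables" to="end"/>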

Oozie/yarn: resource changed on src filesystem

蹲街弑〆低调 submitted on 2019-12-30 10:17:10
Question: I have an Oozie workflow with one of its steps being a Java step, running a jar stored on the local filesystem (the jar is present on all nodes). Initially the jar was installed via an RPM, so all copies had the same timestamp. While experimenting, I manually copied a new version over this jar, and I now get the message:

    org.apache.oozie.action.ActionExecutorException: JA009: org.apache.hadoop.yarn.exceptions.YarnException: Failed to submit application_1516602562532_15451 to YARN : Application
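YARN records the size and modification time of each local resource at submission time and re-checks them when a NodeManager localizes the file, so a copy whose mtime differs on any node fails with "resource changed on src filesystem". A sketch of redeploying with identical timestamps everywhere (hostnames, paths, and the pinned date are placeholders):

    for host in node1 node2 node3; do
      scp myapp.jar "$host:/opt/myapp/myapp.jar"
      # Pin the same mtime on every copy so YARN's localization check passes
      ssh "$host" touch -d '2018-01-22 00:00' /opt/myapp/myapp.jar
    done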

sqoop fails to store incremental state to the metastore

馋奶兔 submitted on 2019-12-29 09:13:10
Question: I get this when saving the incremental import state:

    16/05/15 21:43:05 INFO tool.ImportTool: Saving incremental import state to the metastore
    16/05/15 21:43:56 ERROR tool.ImportTool: Encountered IOException running import job: java.io.IOException: Error communicating with database
        at org.apache.sqoop.metastore.hsqldb.HsqldbJobStorage.createInternal(HsqldbJobStorage.java:426)
        at org.apache.sqoop.metastore.hsqldb.HsqldbJobStorage.update(HsqldbJobStorage.java:445)
        at org.apache.sqoop.tool.ImportTool
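By default the metastore is a set of on-disk HSQLDB files in the user's home directory, and concurrent clients opening those files directly is a common way to hit "Error communicating with database". One alternative is to run the metastore as a shared service and point every job at it over JDBC; a sketch (the hostname is a placeholder; 16000 is Sqoop's default metastore port):

    # On one dedicated host: start the shared metastore service
    sqoop metastore &

    # Clients store and exec jobs through it instead of opening the HSQLDB files
    sqoop job --meta-connect jdbc:hsqldb:hsql://metastore-host:16000/sqoop \
      --exec my_incremental_job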