Oozie

Python subprocess with oozie

点点圈 posted on 2019-12-07 23:58:21
Question: I'm trying to use subprocess in a Python script that I call from an Oozie shell action. The subprocess is supposed to read a file stored in Hadoop's HDFS. I'm using hadoop-1.2.1 in pseudo-distributed mode and oozie-3.3.2. Here is the Python script, named connected_subprocess.py: #!/usr/bin/python import subprocess import networkx as nx liste=subprocess.check_output("hadoop fs -cat /user/root/output-data/calcul-proba/final.txt",shell=True).split('\n') G=nx.DiGraph() f=open("/home/rlk

E0701 XML schema error in OOZIE workflow

空扰寡人 posted on 2019-12-07 20:24:04
Question: The following is my workflow.xml: <workflow-app xmlns="uri:oozie:workflow:0.3" name="import-job"> <start to="createtimelinetable" /> <action name="createtimelinetable"> <sqoop xmlns="uri:oozie:sqoop-action:0.3"> <job-tracker>${jobTracker}</job-tracker> <name-node>${nameNode}</name-node> <configuration> <property> <name>mapred.compress.map.output</name> <value>true</value> </property> </configuration> <command>import --connect jdbc:mysql://10.65.220.75:3306/automation --table ABC --username

oozie Sqoop action fails to import data to hive

前提是你 posted on 2019-12-07 17:58:27
Question: I am facing an issue while executing an Oozie Sqoop action. In the logs I can see that Sqoop is able to import the data to a temp directory, and then Sqoop creates Hive scripts to import the data. It fails while importing the temp data into Hive. I am not getting any exception in the logs. Below is the Sqoop action I am using. <workflow-app name="testSqoopLoadWorkflow" xmlns="uri:oozie:workflow:0.4"> <credentials> <credential name='hive_credentials' type='hcat'> <property> <name>hcat.metastore.uri</name> <value>${HIVE_THRIFT_URL

How to specify multiple jar files in oozie

◇◆丶佛笑我妖孽 posted on 2019-12-07 12:15:55
Question: I need a solution for the following problem. My project has two jars: one contains all the bean classes, such as Employee, and the other contains the MR jobs, which use the bean classes from the first jar. When I try to run the MR job as a simple Java program, I get a class-not-found error (com.abc.Employee is not found because it is in the other jar). Can anyone tell me how to solve this? In a real deployment there may be many jars, not just one or two; how to specify

JA017: Could not lookup launched hadoop Job ID

倾然丶 夕夏残阳落幕 posted on 2019-12-07 10:42:25
Question: How can I solve this problem when I submit a MapReduce job in the Oozie Editor in Hue? JA017: Could not lookup launched hadoop Job ID [job_local152843681_0009] which was associated with action [0000009-150711083342968-oozie-root-W@mapreduce-f660]. Failing this action! UPDATE: Here is the log file: 2015-07-15 04:54:40,304 INFO ActionStartXCommand:520 - SERVER[myserver] USER[root] GROUP[-] TOKEN[] APP[My_Workflow] JOB[0000010-150711083342968-oozie-root-W] ACTION[0000010-150711083342968-oozie-root-W@

Error: E0902: Exception occured: [User: Root is not allowed to impersonate root

岁酱吖の posted on 2019-12-07 08:25:28
I am trying to follow the steps given at http://www.rohitmenon.com/index.php/apache-oozie-installation/ (note: I am not using the Cloudera distribution of Hadoop). That link is similar to http://oozie.apache.org/docs/4.0.1/DG_QuickStart.html but seems more descriptive to me. However, when I run the command below as the root user, I get an exception: ./bin/oozie-setup.sh sharelib create -fs Note: I have two live nodes shown at dfshealth.jsp, and I have updated core-site.xml on all three machines (including the namenode) with the properties below: <property> <name>hadoop.proxyuser.root.hosts</name> <value

Oozie coordinator with sysdate as start time

江枫思渺然 posted on 2019-12-07 08:21:46
Question: I want to run an Oozie coordinator with the start time set to sysdate. How do I do that? Is it possible to use sysdate as the start date? Will it catch up? Answer 1: You can make the coordinator's "start" refer to a variable, startTime, then override its value with sysdate from the command line, such as: oozie job -run -config ./coord.properties -DstartTime=`date -u "+%Y-%m-%dT%H:00Z"` Adjust the time format if you are not using the UTC time zone in your system. Sample coordinator job XML: <coordinator-app name="my
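
The `date -u "+%Y-%m-%dT%H:00Z"` call in the answer produces a UTC timestamp truncated to the hour. If the job is submitted from a script rather than an interactive shell, the same string can be built in Python. This is only a minimal sketch: the function name `coordinator_start_time` is hypothetical and not part of Oozie, and the result would still have to be passed as `-DstartTime=...` as the answer describes.

```python
from datetime import datetime, timezone


def coordinator_start_time(now=None):
    """Return a UTC timestamp truncated to the hour, in the same
    format as `date -u "+%Y-%m-%dT%H:00Z"` from the answer above.

    `now` is an optional timezone-aware datetime (hypothetical
    parameter, mainly useful for testing); it defaults to the
    current time.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    # Normalise to UTC so an aware local-time input still yields a UTC stamp.
    return now.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:00Z")


if __name__ == "__main__":
    # Print the value that would be supplied as -DstartTime=...
    print(coordinator_start_time())
```

Minutes are fixed at `00` here only because the answer's format string does so; adjust the format if the coordinator should start at a finer granularity.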

Passing parameters from one action to another in Oozie

假如想象 posted on 2019-12-07 07:29:22
I have the following shell script: DATE=$(date +"%d%b%y" -d "-1 days") How can I pass DATE to a Java action? You can capture the output of the shell script and pass it to the Java action. In the shell script, echo the property as 'dateVariable=${DATE}' and add the capture-output element to the shell action. This lets you capture dateVariable from the shell script. In the Java action, you can pass the captured variable as a parameter using ${wf:actionData('shellAction')['dateVariable']}, where shellAction is the shell action name. Sample workflow: <?xml version="1.0" encoding="UTF-8"?> <workflow-app xmlns="uri
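
The capture-output mechanism described above reads `key=value` lines that the action's script writes to standard output. As a rough sketch, assuming the shell action runs a Python script instead of the original shell one-liner, the same line could be produced as follows. The function name `capture_output_line` is hypothetical; the key `dateVariable` and the `%d%b%y` format come from the question and answer.

```python
from datetime import date, timedelta


def capture_output_line(today=None, key="dateVariable"):
    """Build the `key=value` line for yesterday's date.

    Mirrors `date +"%d%b%y" -d "-1 days"` from the question.
    `today` is an optional datetime.date (hypothetical parameter,
    mainly useful for testing). Note that %b is locale-dependent;
    the default C locale yields English month abbreviations.
    """
    if today is None:
        today = date.today()
    yesterday = today - timedelta(days=1)
    return f"{key}={yesterday.strftime('%d%b%y')}"


if __name__ == "__main__":
    # Write the property to stdout so that capture-output can pick it up.
    print(capture_output_line())
```

The script should print nothing else to standard output, since the captured text is expected to consist of property lines; diagnostic messages are better sent to standard error.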

Imported Failed: Cannot convert SQL type 2005==> during importing CLOB data from Oracle database

拟墨画扇 posted on 2019-12-07 06:05:28
I am trying to import an Oracle table's data with a CLOB data type using Sqoop, and it is failing with the error Imported Failed: Cannot convert SQL type 2005. I am running Sqoop version 1.4.5-cdh5.4.7. Please help me import the CLOB data type. I am using the Oozie workflow below to import the data: <workflow-app xmlns="uri:oozie:workflow:0.4" name="EBIH_Dly_tldb_dly_load_wf"> <credentials> <credential name="hive2_cred" type="hive2"> <property> <name>hive2.jdbc.url</name> <value>${hive2_jdbc_uri}</value> </property> <property> <name>hive2.server.principal</name> <value>${hive2_server

Can I rename the oozie job name dynamically

旧时模样 posted on 2019-12-07 04:12:33
Question: We have a Hadoop service with multiple applications. We need to process the data for each application by re-executing the same workflow. These are scheduled to execute at the same time of day. The issue is that when these jobs are running, it is hard to know which application a job is running, failed, or succeeded for. Of course, I can open the job configuration to find out, but that takes time since there are tens of applications running under that service. Is there any