I have a shell script which executes a Sqoop job. The script is below:
#!/bin/bash
table=$1
sqoop job --exec ${table}
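For context, sqoop job --exec runs a saved Sqoop job, so this assumes a job named after each table was created beforehand with something like the following (the connection details here are placeholders):

sqoop job --create test123 -- import \
    --connect jdbc:mysql://dbhost/mydb \
    --table test123 \
    --target-dir /user/data/test123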
When I pass the table name to the workflow, the Sqoop job executes successfully. The workflow is below:
<workflow-app name="Shell_script" xmlns="uri:oozie:workflow:0.5">
    <start to="shell_script"/>
    <kill name="Kill">
        <message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <action name="shell_script">
        <shell xmlns="uri:oozie:shell-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <exec>sqoopjob.sh</exec>
            <argument>test123</argument>
            <file>/user/oozie/sqoop/lib/sqoopjob.sh#sqoopjob.sh</file>
        </shell>
        <ok to="End"/>
        <error to="Kill"/>
    </action>
    <end name="End"/>
</workflow-app>
The job executes successfully for the table test123.
Now I have 300 Sqoop jobs just like the one above, and I want to execute 10 of them in parallel. All the table names are in a single file. I want to loop through that file and execute the Sqoop jobs for the first 10 tables in parallel, then the next 10, and so on.
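Outside of Oozie I can picture a plain shell driver like the sketch below (assuming GNU xargs and a hypothetical tables.txt with one table name per line; -P 10 caps the number of concurrent jobs at 10):

#!/bin/bash
# Run sqoopjob.sh once per table name in tables.txt,
# keeping at most 10 jobs running at any time (GNU xargs).
xargs -n 1 -P 10 ./sqoopjob.sh < tables.txt

But I do not know how to achieve the same effect from within an Oozie workflow.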
How can I do this? Should I prepare 10 separate workflows? I am genuinely confused about the right approach.