Question
I'm trying to set up Oozie on a CDH 5.7 cluster. I installed and configured everything by following the steps in the Cloudera documentation. Finally, I extracted oozie-examples.tar.gz, uploaded it to HDFS with -put, and tried to run some of the examples. The MR example runs fine, but the Spark one fails with the following error:
Resource hdfs://cluster/user/hdfs/.sparkStaging/application_1462195303197_0009/oozie-examples.jar changed on src filesystem (expected 1462196523983, was 1462196524951
The command I used to run the example was:
oozie job -config /usr/share/doc/oozie/examples/apps/spark/job.properties -run
The contents of job.properties:
nameNode=hdfs://cluster:8020
jobTracker=aleo-master-0:8021
master=yarn-cluster
queueName=default
examplesRoot=examples
oozie.use.system.libpath=true
oozie.wf.application.path=${nameNode}/user/${user.name}/${examplesRoot}/apps/spark
And workflow.xml:
<workflow-app xmlns='uri:oozie:workflow:0.5' name='SparkFileCopy'>
    <start to='spark-node' />
    <action name='spark-node'>
        <spark xmlns="uri:oozie:spark-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <prepare>
                <delete path="${nameNode}/user/${wf:user()}/${examplesRoot}/output-data/spark"/>
            </prepare>
            <master>${master}</master>
            <name>Spark-FileCopy</name>
            <class>org.apache.oozie.example.SparkFileCopy</class>
            <jar>${nameNode}/user/${wf:user()}/${examplesRoot}/apps/spark/lib/oozie-examples.jar</jar>
            <arg>${nameNode}/user/${wf:user()}/${examplesRoot}/input-data/text/data.txt</arg>
            <arg>${nameNode}/user/${wf:user()}/${examplesRoot}/output-data/spark</arg>
        </spark>
        <ok to="end" />
        <error to="fail" />
    </action>
    <kill name="fail">
        <message>Workflow failed, error
            message[${wf:errorMessage(wf:lastErrorNode())}]
        </message>
    </kill>
    <end name='end' />
</workflow-app>
Version information:
- Spark 1.6.0
- Oozie 4.1.0-cdh5.7.0
Has anyone seen this problem before? I also tried running SparkPi with my own workflow definition, but the result was the same.
Thanks for your help!
Answer 1:
Did you try cleaning up Spark's staging path? Spark places a temporary copy of the given jar in its HDFS staging path and may not be able to distinguish between two different jars with the same name there.
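A minimal sketch of what such a cleanup might look like, assuming the job ran as user hdfs (taken from the path in the error message) and that the hdfs CLI is available on a cluster node. The helper names staging_dir and run and the DRY_RUN variable are hypothetical and only serve this illustration. By default the script only prints the commands so they can be reviewed first:

```shell
#!/bin/sh
# Sketch only: user name and application ID are copied from the error
# message in the question; adjust both for your own cluster.

# Hypothetical helper: build the staging path for a given user.
# Assumes the default layout seen in the error (/user/<name>/.sparkStaging).
staging_dir() {
  printf '/user/%s/.sparkStaging\n' "$1"
}

# Hypothetical helper: print the command when DRY_RUN=1 (the default here),
# execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

STAGING="$(staging_dir hdfs)"

# List leftover staging directories before deleting anything.
run hdfs dfs -ls "$STAGING"

# Remove the staging directory of the failed application.
run hdfs dfs -rm -r "$STAGING/application_1462195303197_0009"
```

Running it with DRY_RUN=0 executes the commands for real. Only remove staging directories that belong to applications that are no longer running. Whether this resolves the "changed on src filesystem" error is not confirmed in this thread.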
Source: https://stackoverflow.com/questions/36984908/unable-to-run-example-spark-job-with-oozie