Unable to run example spark job with oozie

Submitted by 99封情书 on 2019-12-11 12:23:32

Question


I'm trying to set up Oozie on a CDH 5.7 cluster. I installed and configured everything by following the steps in the Cloudera documentation. Finally, I extracted oozie-examples.tar.gz, uploaded it to HDFS with hdfs dfs -put (see the sketch after the error below), and tried to run some of the examples. The MR example runs fine, but the Spark one fails with the following error:

Resource hdfs://cluster/user/hdfs/.sparkStaging/application_1462195303197_0009/oozie-examples.jar changed on src filesystem (expected 1462196523983, was 1462196524951)
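For reference, the examples were staged roughly like this. The tarball location and target HDFS path are assumptions based on the CDH packaging and the job.properties below; adjust for your install:

# Extract the bundled examples (tarball path assumed from the CDH packaging)
tar -xzf /usr/share/doc/oozie/oozie-examples.tar.gz -C /usr/share/doc/oozie/
# Upload the extracted examples directory to the running user's HDFS home
hdfs dfs -put /usr/share/doc/oozie/examples /user/hdfs/examples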

The command I used to run the example was:

oozie job -config /usr/share/doc/oozie/examples/apps/spark/job.properties -run

The contents of job.properties:

nameNode=hdfs://cluster:8020
jobTracker=aleo-master-0:8021
master=yarn-cluster
queueName=default
examplesRoot=examples
oozie.use.system.libpath=true
oozie.wf.application.path=${nameNode}/user/${user.name}/${examplesRoot}/apps/spark

And workflow.xml:

<workflow-app xmlns='uri:oozie:workflow:0.5' name='SparkFileCopy'>
<start to='spark-node' />

<action name='spark-node'>
    <spark xmlns="uri:oozie:spark-action:0.1">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <prepare>
            <delete path="${nameNode}/user/${wf:user()}/${examplesRoot}/output-data/spark"/>
        </prepare>
        <master>${master}</master>
        <name>Spark-FileCopy</name>
        <class>org.apache.oozie.example.SparkFileCopy</class>
        <jar>${nameNode}/user/${wf:user()}/${examplesRoot}/apps/spark/lib/oozie-examples.jar</jar>
        <arg>${nameNode}/user/${wf:user()}/${examplesRoot}/input-data/text/data.txt</arg>
        <arg>${nameNode}/user/${wf:user()}/${examplesRoot}/output-data/spark</arg>
    </spark>
    <ok to="end" />
    <error to="fail" />
</action>

<kill name="fail">
    <message>Workflow failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
</kill>
<end name='end' />
</workflow-app>

Version information:

  1. Spark 1.6.0
  2. Oozie 4.1.0-cdh5.7.0

Has anyone seen this problem before? I also tried running SparkPi with my own workflow definition, but the result was the same.

Thanks for the help!


Answer 1:


Did you try cleaning up Spark's staging path? Spark copies a temporary copy of the given jar into its HDFS staging path and may not be able to distinguish between two different jars with the same name there.
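A minimal sketch, assuming the staging directory from the error message above and a user with permissions on it:

# Remove the stale staging directory left behind by the failed application
hdfs dfs -rm -r hdfs://cluster/user/hdfs/.sparkStaging/application_1462195303197_0009

Then resubmit the workflow so Spark stages a fresh copy of oozie-examples.jar.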



Source: https://stackoverflow.com/questions/36984908/unable-to-run-example-spark-job-with-oozie
