Hadoop Error: Error launching job, bad input path: File does not exist. Streaming Command Failed

Posted by 坚强是说给别人听的谎言 on 2019-12-23 05:17:17

Question


I am running an mrjob job on a Hadoop cluster and I am getting the following error:

No configs found; falling back on auto-configuration
Looking for hadoop binary in $PATH...
Found hadoop binary: /usr/local/hadoop/bin/hadoop
Using Hadoop version 2.7.3
Looking for Hadoop streaming jar in /usr/local/hadoop...
Found Hadoop streaming jar: /usr/local/hadoop/share/hadoop/tools/lib/hadoop-streaming-2.7.3.jar
Creating temp directory /tmp/Mr_Jobs.hduser.20170227.030012.446820
Copying local files to hdfs:///user/hduser/tmp/mrjob/Mr_Jobs.hduser.20170227.030012.446820/files/...
Running step 1 of 1...
  session.id is deprecated. Instead, use dfs.metrics.session-id
  Initializing JVM Metrics with processName=JobTracker, sessionId=
  Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
  Cleaning up the staging area file:/app/hadoop/tmp/mapred/staging/hduser1748755362/.staging/job_local1748755362_0001
  Error launching job , bad input path : File does not exist: /app/hadoop/tmp/mapred/staging/hduser1748755362/.staging/job_local1748755362_0001/files/Mr_Jobs.py#Mr_Jobs.py
  Streaming Command Failed!
Attempting to fetch counters from logs...
Can't fetch history log; missing job ID
No counters found
Scanning logs for probable cause of failure...
Can't fetch history log; missing job ID
Can't fetch task logs; missing application ID
Step 1 of 1 failed: Command '['/usr/local/hadoop/bin/hadoop', 'jar', '/usr/local/hadoop/share/hadoop/tools/lib/hadoop-streaming-2.7.3.jar', '-files', 'hdfs:///user/hduser/tmp/mrjob/Mr_Jobs.hduser.20170227.030012.446820/files/Mr_Jobs.py#Mr_Jobs.py,hdfs:///user/hduser/tmp/mrjob/Mr_Jobs.hduser.20170227.030012.446820/files/mrjob.zip#mrjob.zip,hdfs:///user/hduser/tmp/mrjob/Mr_Jobs.hduser.20170227.030012.446820/files/setup-wrapper.sh#setup-wrapper.sh', '-input', 'hdfs:///user/hduser/tmp/mrjob/Mr_Jobs.hduser.20170227.030012.446820/files/File.txt', '-output', 'hdfs:///user/hduser/tmp/mrjob/Mr_Jobs.hduser.20170227.030012.446820/output', '-mapper', 'sh -ex setup-wrapper.sh python3 Mr_Jobs.py --step-num=0 --mapper', '-combiner', 'sh -ex setup-wrapper.sh python3 Mr_Jobs.py --step-num=0 --combiner', '-reducer', 'sh -ex setup-wrapper.sh python3 Mr_Jobs.py --step-num=0 --reducer']' returned non-zero exit status 512

I am running the job with this command:

python3 /home/bhoots21304/Desktop/MrJobs-MR.py -r hadoop hdfs://input3/File.txt
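One detail of the command worth checking is the input URI. Under generic URI syntax, the text right after `hdfs://` is the authority (the namenode host), so `hdfs://input3/File.txt` names a file `/File.txt` on a host called `input3`, whereas `hdfs:///input3/File.txt` names `/input3/File.txt` on the default namenode. The log does not show whether this is related to the failure; the sketch below only illustrates the parsing difference with Python's standard library, and `split_hdfs_uri` is a hypothetical helper name.

```python
from urllib.parse import urlparse


def split_hdfs_uri(uri):
    """Return (authority, path) for a URI, using generic URI parsing rules."""
    parts = urlparse(uri)
    return parts.netloc, parts.path


# Two slashes: "input3" is read as the host, not as a directory.
print(split_hdfs_uri("hdfs://input3/File.txt"))   # ('input3', '/File.txt')

# Three slashes: the authority is empty and "input3" is part of the path.
print(split_hdfs_uri("hdfs:///input3/File.txt"))  # ('', '/input3/File.txt')
```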

Also, the first line of the output says: No configs found; falling back on auto-configuration

I looked this up online. It says there should be a file named mrjob.conf in the /etc/ folder, but it is not present anywhere on my filesystem. Do I need to create this file? If so, what should its contents be?
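For context, /etc/mrjob.conf is not the only place the file can live. As far as mrjob's documentation describes it, the lookup order is: the path in the `MRJOB_CONF` environment variable (if set), then `~/.mrjob.conf`, then `/etc/mrjob.conf`; the "No configs found" message appears when none of these exist. The following is a minimal sketch of that lookup order, not mrjob's own code, and the function names are hypothetical.

```python
import os


def candidate_conf_paths(environ, home):
    """List the places a config file is looked for, in priority order."""
    paths = []
    # An explicit MRJOB_CONF setting is checked first, if present.
    if environ.get("MRJOB_CONF"):
        paths.append(os.path.expanduser(environ["MRJOB_CONF"]))
    # Then the per-user file, then the system-wide file.
    paths.append(os.path.join(home, ".mrjob.conf"))
    paths.append("/etc/mrjob.conf")
    return paths


def find_conf(environ=None, home=None):
    """Return the first candidate path that exists, or None."""
    environ = os.environ if environ is None else environ
    home = os.path.expanduser("~") if home is None else home
    for path in candidate_conf_paths(environ, home):
        if os.path.exists(path):
            return path
    return None


if __name__ == "__main__":
    print(find_conf() or "no mrjob.conf found")
```

Running this on the machine that submits the job shows which file, if any, would be picked up.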

I installed Hadoop using the instructions in this file:

https://github.com/ev2900/Dev_Notes/blob/master/Hadoop/notes.txt

Also, hadoop-env.sh, core-site.xml, mapred-site.xml, and hdfs-site.xml are configured correctly, because a simple wordcount job (without mrjob) runs fine.

(I installed mrjob using 'sudo -H pip3 install mrjob'.)


Answer 1:


You need to specify python_bin and hadoop_streaming_jar in mrjob.conf. It should look something like this, depending on the location of the jar:

runners:
    hadoop:
        python_bin: python3
        hadoop_streaming_jar: /usr/local/hadoop/share/hadoop/tools/lib/hadoop-streaming-2.7.3.jar
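Since the file does not exist yet, it has to be created by hand. The sketch below writes the configuration shown above to a path of your choosing. `write_mrjob_conf` is a hypothetical helper, and the default jar path is simply the one reported in the question's log, so adjust it to your installation. It refuses to overwrite an existing file.

```python
import os

# Same structure as the YAML snippet above; values are filled in per call.
CONF_TEMPLATE = """\
runners:
    hadoop:
        python_bin: {python_bin}
        hadoop_streaming_jar: {streaming_jar}
"""

# Assumption: jar location taken from the log in the question.
DEFAULT_JAR = (
    "/usr/local/hadoop/share/hadoop/tools/lib/hadoop-streaming-2.7.3.jar"
)


def write_mrjob_conf(path, python_bin="python3", streaming_jar=DEFAULT_JAR):
    """Write a minimal mrjob.conf to `path` and return the text written.

    Raises FileExistsError rather than overwriting an existing config.
    """
    text = CONF_TEMPLATE.format(
        python_bin=python_bin, streaming_jar=streaming_jar
    )
    with open(path, "x") as f:  # "x" mode fails if the file already exists
        f.write(text)
    return text


# Example (not run automatically):
#   write_mrjob_conf(os.path.expanduser("~/.mrjob.conf"))
```

Writing to `~/.mrjob.conf` avoids needing root access for /etc. If you would rather not keep a config file at all, mrjob also accepts `--python-bin` and `--hadoop-streaming-jar` on the command line; check `--help` for your installed version to confirm the exact switches.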


Source: https://stackoverflow.com/questions/42477386/hadoop-error-error-launching-job-bad-input-path-file-does-not-exist-streami
