Permission denied error 13 - Python on Hadoop

Submitted by 末鹿安然 on 2020-01-15 07:12:49

Question


I am running a simple Python mapper and reducer and am getting an error 13 (permission denied). Need help.

I am not sure what is happening here and need help. I am new to the Hadoop world.

I am running a simple MapReduce word count. The mapper and reducer run fine independently on Linux or in Windows PowerShell.
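
For context, here is a minimal sketch of the standard Hadoop Streaming word-count pair. The actual scripts are not shown in the question, so treat this as an illustration of the pattern, not the original code.

mapper.py reads lines from stdin and emits one "word<TAB>1" pair per token:

#!/usr/bin/env python
import sys

# Emit "word<TAB>1" for every whitespace-separated token on stdin.
for line in sys.stdin:
    for word in line.split():
        print("%s\t1" % word)

reducer.py relies on the streaming framework sorting its input by key, so all counts for a given word arrive adjacently and can be summed in a single pass:

#!/usr/bin/env python
import sys

current_word = None
current_count = 0
for line in sys.stdin:
    # Each input line has the form "word<TAB>count".
    word, count = line.rstrip("\n").split("\t", 1)
    if word == current_word:
        current_count += int(count)
    else:
        if current_word is not None:
            print("%s\t%d" % (current_word, current_count))
        current_word = word
        current_count = int(count)
# Flush the final key.
if current_word is not None:
    print("%s\t%d" % (current_word, current_count))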

======================================================================


hadoop@ubuntu:~/hadoop-1.2.1$ bin/hadoop jar contrib/streaming/hadoop-streaming-1.2.1.jar -file /home/hadoop/mapper.py -mapper mapper.py -file /home/hadoop/reducer.py -reducer reducer.py -input /deepw/pg4300.txt -output /deepw/pg3055
Warning: $HADOOP_HOME is deprecated.

packageJobJar: [/home/hadoop/mapper.py, /home/hadoop/reducer.py, /tmp/hadoop-hadoop/hadoop-unjar2961168567699201508/] [] /tmp/streamjob4125164474101219622.jar tmpDir=null
15/09/23 14:39:16 INFO util.NativeCodeLoader: Loaded the native-hadoop library
15/09/23 14:39:16 WARN snappy.LoadSnappy: Snappy native library not loaded
15/09/23 14:39:16 INFO mapred.FileInputFormat: Total input paths to process : 1
15/09/23 14:39:16 INFO streaming.StreamJob: getLocalDirs(): [/tmp/hadoop-hadoop/mapred/local]
15/09/23 14:39:16 INFO streaming.StreamJob: Running job: job_201509231312_0003
15/09/23 14:39:16 INFO streaming.StreamJob: To kill this job, run:
15/09/23 14:39:16 INFO streaming.StreamJob: /home/hadoop/hadoop-1.2.1/libexec/../bin/hadoop job -Dmapred.job.tracker=192.168.56.102:9001 -kill job_201509231312_0003
15/09/23 14:39:16 INFO streaming.StreamJob: Tracking URL: http://192.168.56.102:50030/jobdetails.jsp?jobid=job_201509231312_0003
15/09/23 14:39:17 INFO streaming.StreamJob: map 0% reduce 0%
15/09/23 14:39:41 INFO streaming.StreamJob: map 100% reduce 100%
15/09/23 14:39:41 INFO streaming.StreamJob: To kill this job, run:
15/09/23 14:39:41 INFO streaming.StreamJob: /home/hadoop/hadoop-1.2.1/libexec/../bin/hadoop job -Dmapred.job.tracker=192.168.56.102:9001 -kill job_201509231312_0003
15/09/23 14:39:41 INFO streaming.StreamJob: Tracking URL: http://192.168.56.102:50030/jobdetails.jsp?jobid=job_201509231312_0003
15/09/23 14:39:41 ERROR streaming.StreamJob: Job not successful. Error: # of failed Map Tasks exceeded allowed limit. FailedCount: 1. LastFailedTask: task_201509231312_0003_m_000000
15/09/23 14:39:41 INFO streaming.StreamJob: killJob...
Streaming Command Failed!

================================================================
java.io.IOException: Cannot run program "/tmp/hadoop-hadoop/mapred/local/taskTracker/hadoop/jobcache/job_201509231312_0003/attempt_201509231312_0003_m_000001_3/work/./mapper.py": error=13, Permission denied
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1047)
at org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:214)
at org.apache.hadoop.streaming.PipeMapper.configure(PipeMapper.java:66)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:426)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:366)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.io.IOException: error=13, Permission denied
at java.lang.UNIXProcess.forkAndExec(Native Method)
at java.lang.UNIXProcess.<init>(UNIXProcess.java:186)
at java.lang.ProcessImpl.start(ProcessImpl.java:130)
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1028)
... 24 more

Answer 1:


It seems your mapper file is not executable. Try chmod a+x mapper.py before submitting your job.

Alternatively, in your command, you can replace

-mapper mapper.py

with

-mapper "python mapper.py"



Answer 2:


As a note, I recently hit this error 13 problem as well. In my case, however, the problem was with the directory containing the Python executable and the mappers/reducers: it was not readable by others. After a chmod a+rx, my problem was fixed.
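
For example, assuming (as in the question above) that the scripts live under /home/hadoop, this grants everyone read and traverse access to that directory:

chmod a+rx /home/hadoop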




Answer 3:


After running chmod a+x on the mapper and reducer .py files, I am getting the exception below (with the python keyword added to the mapper command it works fine and produces the right results).

========================================================================================

2015-09-28 13:25:16,572 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
2015-09-28 13:25:16,752 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /deep/mapred/local/taskTracker/hadoop/jobcache/job_201509281234_0015/jars/META-INF <- /deep/mapred/local/taskTracker/hadoop/jobcache/job_201509281234_0015/attempt_201509281234_0015_m_000000_3/work/META-INF
2015-09-28 13:25:16,761 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /deep/mapred/local/taskTracker/hadoop/jobcache/job_201509281234_0015/jars/reducer.py <- /deep/mapred/local/taskTracker/hadoop/jobcache/job_201509281234_0015/attempt_201509281234_0015_m_000000_3/work/reducer.py
2015-09-28 13:25:16,763 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /deep/mapred/local/taskTracker/hadoop/jobcache/job_201509281234_0015/jars/job.jar <- /deep/mapred/local/taskTracker/hadoop/jobcache/job_201509281234_0015/attempt_201509281234_0015_m_000000_3/work/job.jar
2015-09-28 13:25:16,766 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /deep/mapred/local/taskTracker/hadoop/jobcache/job_201509281234_0015/jars/.job.jar.crc <- /deep/mapred/local/taskTracker/hadoop/jobcache/job_201509281234_0015/attempt_201509281234_0015_m_000000_3/work/.job.jar.crc
2015-09-28 13:25:16,769 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /deep/mapred/local/taskTracker/hadoop/jobcache/job_201509281234_0015/jars/org <- /deep/mapred/local/taskTracker/hadoop/jobcache/job_201509281234_0015/attempt_201509281234_0015_m_000000_3/work/org
2015-09-28 13:25:16,771 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /deep/mapred/local/taskTracker/hadoop/jobcache/job_201509281234_0015/jars/mapper.py <- /deep/mapred/local/taskTracker/hadoop/jobcache/job_201509281234_0015/attempt_201509281234_0015_m_000000_3/work/mapper.py
2015-09-28 13:25:17,046 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Source name ugi already exists!
2015-09-28 13:25:17,176 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
2015-09-28 13:25:17,184 INFO org.apache.hadoop.mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@1e7c7fb
2015-09-28 13:25:17,254 INFO org.apache.hadoop.mapred.MapTask: Processing split: hdfs://192.168.56.101:9000/swad/4300.txt:0+786539
2015-09-28 13:25:17,275 WARN org.apache.hadoop.io.compress.snappy.LoadSnappy: Snappy native library not loaded
2015-09-28 13:25:17,287 INFO org.apache.hadoop.mapred.MapTask: numReduceTasks: 1
2015-09-28 13:25:17,296 INFO org.apache.hadoop.mapred.MapTask: io.sort.mb = 100
2015-09-28 13:25:17,393 INFO org.apache.hadoop.mapred.MapTask: data buffer = 79691776/99614720
2015-09-28 13:25:17,393 INFO org.apache.hadoop.mapred.MapTask: record buffer = 262144/327680
2015-09-28 13:25:17,419 INFO org.apache.hadoop.streaming.PipeMapRed: PipeMapRed exec [/deep/mapred/local/taskTracker/hadoop/jobcache/job_201509281234_0015/attempt_201509281234_0015_m_000000_3/work/./mapper.py]
2015-09-28 13:25:17,436 ERROR org.apache.hadoop.streaming.PipeMapRed: configuration exception
java.io.IOException: Cannot run program "/deep/mapred/local/taskTracker/hadoop/jobcache/job_201509281234_0015/attempt_201509281234_0015_m_000000_3/work/./mapper.py": error=2, No such file or directory
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:1047)
    at org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:214)
    at org.apache.hadoop.streaming.PipeMapper.configure(PipeMapper.java:66)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
    at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:426)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:366)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.io.IOException: error=2, No such file or directory
    at java.lang.UNIXProcess.forkAndExec(Native Method)
    at java.lang.UNIXProcess.<init>(UNIXProcess.java:186)
    at java.lang.ProcessImpl.start(ProcessImpl.java:130)
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:1028)
    ... 24 more
2015-09-28 13:25:17,462 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
2015-09-28 13:25:17,495 INFO org.apache.hadoop.io.nativeio.NativeIO: Initialized cache for UID to User mapping with a cache timeout of 14400 seconds.
2015-09-28 13:25:17,496 INFO org.apache.hadoop.io.nativeio.NativeIO: Got UserName hadoop for UID 1000 from the native implementation
2015-09-28 13:25:17,498 WARN org.apache.hadoop.mapred.Child: Error running child
java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:426)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:366)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
    ... 9 more
Caused by: java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
    at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
    ... 14 more
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
    ... 17 more
Caused by: java.lang.RuntimeException: configuration exception
    at org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:230)
    at org.apache.hadoop.streaming.PipeMapper.configure(PipeMapper.java:66)
    ... 22 more
Caused by: java.io.IOException: Cannot run program "/deep/mapred/local/taskTracker/hadoop/jobcache/job_201509281234_0015/attempt_201509281234_0015_m_000000_3/work/./mapper.py": error=2, No such file or directory
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:1047)
    at org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:214)
    ... 23 more
Caused by: java.io.IOException: error=2, No such file or directory
    at java.lang.UNIXProcess.forkAndExec(Native Method)
    at java.lang.UNIXProcess.<init>(UNIXProcess.java:186)
    at java.lang.ProcessImpl.start(ProcessImpl.java:130)
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:1028)
    ... 24 more
2015-09-28 13:25:17,506 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task
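
A likely explanation for the error=2 above: when Hadoop execs ./mapper.py directly, the kernel resolves the interpreter named in the script's shebang line. A missing shebang, a shebang pointing at a path that does not exist on the task node, or Windows CRLF line endings (which make the kernel look for an interpreter literally named "python\r") all yield "No such file or directory" even though mapper.py itself is present, which is also why prefixing the command with python works. A quick check, assuming the script is in the current directory:

head -1 mapper.py | cat -A

cat -A makes control characters visible, so a first line ending in ^M$ indicates CRLF endings; sed -i 's/\r$//' mapper.py strips them.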



Answer 4:


I struggled with this as well. I found that everything worked when running on a single node (the Cloudera QuickStart VM), but on a cluster it did not. It seems the Python scripts were not being shipped to the nodes for execution.

The -file parameter ships a file or directory to the task nodes as part of the job. It is documented here:

https://wiki.apache.org/hadoop/HadoopStreaming

You can pass -file multiple times, once for the mapper and again for the reducer, like this:

hadoop jar /opt/cloudera/parcels/CDH-5.9.0-1.cdh5.9.0.p0.23/lib/hadoop-mapreduce/hadoop-streaming.jar -input /user/linux/input -output /user/linux/output_new -mapper wordcount_mapper.py -reducer wordcount_reducer.py -file /home/linux/wordcount_mapper.py -file /home/linux/wordcount_reducer.py

or you can package the scripts in a directory and ship just the directory, like this:

hadoop jar /opt/cloudera/parcels/CDH-5.9.0-1.cdh5.9.0.p0.23/lib/hadoop-mapreduce/hadoop-streaming.jar -input /user/linux/input -output /user/linux/output_new -mapper wc/wordcount_mapper.py -reducer wc/wordcount_reducer.py -file /home/linux/wc

Note that here I refer to the mapper and reducer scripts with paths relative to the shipped directory (wc/...).

The comment about the file being readable and executable is also correct.

It took me a while to work this out. I hope it helps.



Source: https://stackoverflow.com/questions/32735668/permission-denied-error-13-python-on-hadoop
