Running a job using hadoop streaming and mrjob: PipeMapRed.waitOutputThreads(): subprocess failed with code 1

隐瞒了意图╮ 2020-12-06 02:35

Hey, I'm fairly new to the world of Big Data. I came across this tutorial: http://musicmachinery.com/2011/09/04/how-to-process-a-million-songs-in-20-minutes/

It d

4 answers
  •  无人及你
    2020-12-06 03:04

    Error code 1 is a generic error for Hadoop Streaming. You can get this error code for two main reasons:

    • Your Mapper and Reducer scripts are not executable. Each script needs a shebang line such as #!/usr/bin/python at the top, and its executable bit must be set (chmod +x mapper.py).

    • Your Python program itself fails: it could have a syntax error, or a logic bug that raises an exception at runtime.
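The first cause can be checked before submitting the job. The following is a minimal sketch, not part of the original answer: the helper name check_streaming_script is hypothetical, and it assumes the script is passed to Hadoop Streaming directly (rather than as an argument to an explicit python command), so that both a shebang line and an execute bit are needed.

```python
import os
import stat


def check_streaming_script(path):
    """Return a list of likely launch problems for a streaming script.

    Hypothetical helper: it only inspects the file, it does not run it.
    """
    problems = []

    # The first line should be a shebang such as #!/usr/bin/python.
    with open(path, "rb") as handle:
        first_line = handle.readline()
    if not first_line.startswith(b"#!"):
        problems.append("missing shebang line")

    # At least one execute bit (user, group, or other) should be set.
    mode = os.stat(path).st_mode
    if not mode & (stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH):
        problems.append("executable bit not set")

    return problems
```

An empty list means neither problem was found; the second cause (a bug inside the script) still has to be ruled out by running it.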

    Unfortunately, error code 1 does not give you any details to see exactly what is wrong with your Python program.

    I was stuck on error code 1 for a while myself. What finally helped was running my Mapper script as a standalone Python program: python mapper.py

    After doing this, I got a regular Python error telling me that I was passing a function the wrong type of argument. I fixed that bug, and everything worked after that. So if possible, run your Mapper or Reducer script as a standalone Python program to see whether that gives you any insight into the cause of your error.
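The standalone run described above can also be scripted, so that the exit code and the Python traceback are captured together. This is only a sketch under stated assumptions: run_locally is a hypothetical helper name, and it assumes the script reads its records from standard input, as Hadoop Streaming scripts normally do.

```python
import subprocess
import sys


def run_locally(script_path, sample_lines):
    """Run a streaming script as a plain Python program on sample input.

    Hypothetical helper: returns (exit code, stdout, stderr) so that the
    traceback hidden behind "failed with code 1" becomes visible.
    """
    result = subprocess.run(
        [sys.executable, script_path],         # same as: python mapper.py
        input="\n".join(sample_lines) + "\n",  # sample records fed on stdin
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stdout, result.stderr
```

For example, run_locally("mapper.py", ["one sample input line"]) returns a non-zero exit code and the full traceback in the third element if the script raises an uncaught exception.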
