From this guide, I have successfully run the sample exercise. But on running my mapreduce job, I am getting the following error
ERROR streaming.StreamJob: Job not
I ran into this error recently, and my problem turned out to be something as obvious (in hindsight) as these other solutions:
I simply had a bug in my Python code. (In my case, I was using Python v2.7 string formatting whereas the AWS EMR cluster I had was using Python v2.6).
To find the actual Python error, go to Job Tracker web UI (in the case of AWS EMR, port 9100 for AMI 2.x and port 9026 for AMI 3.x); find the failed mapper; open its logs; and read the stderr output.