Python Hadoop on Windows cmd, one mapper and multiple inputs, Error: subprocess failed
Question

I want to run a Python file for a machine learning task. It takes two input files (train and test), both of which are needed for the learning process, and I have no reducer file. I have three doubts about how to run my command:

1. Using two input files: I used -input file1 -input file2, following "Using multiple mapper inputs in one streaming job on hadoop?"
2. Turning off the reduce phase: I used -D mapred.reduce.tasks=0, following "How to write 'map only' hadoop jobs?"
3. How to flush my "sys