Chaining multiple mapreduce tasks in Hadoop streaming

余生分开走 2020-12-15 12:45

I am in a scenario where I have two MapReduce jobs. I am more comfortable with Python, so I plan to write the MapReduce scripts in it and use Hadoop streaming for the same.

4 answers
  •  自闭症患者
    2020-12-15 13:13

    If you are already writing your mapper and reducer in Python, I would consider using Dumbo, where chaining jobs like this is straightforward. The sequence of MapReduce jobs, along with each mapper and reducer, lives in a single Python script that can be run from the command line.
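
To make the idea of "all stages in one script" concrete, here is a minimal sketch that runs two chained stages locally in plain Python. It does not import Dumbo or call Hadoop; `run_stage`, `run_chain`, `STAGES`, and the word-count example are hypothetical names chosen for illustration. The mappers and reducers are written as generator functions taking `(key, value)` and `(key, values)`, which is the style Dumbo uses as far as I recall; with Dumbo itself you would register each stage through its own API (I believe `job.additer(mapper, reducer)` inside a runner passed to `dumbo.main`, but check the Dumbo documentation) instead of using the local runner below.

```python
from itertools import groupby
from operator import itemgetter


def count_mapper(key, value):
    # Stage 1 map: emit (word, 1) for every word in an input line.
    for word in value.split():
        yield word, 1


def count_reducer(key, values):
    # Stage 1 reduce: sum the counts for one word.
    yield key, sum(values)


def invert_mapper(key, value):
    # Stage 2 map: re-key each (word, count) pair by its count.
    yield value, key


def invert_reducer(key, values):
    # Stage 2 reduce: collect the words that share a count.
    yield key, sorted(values)


def run_stage(records, mapper, reducer):
    # Local stand-in for one MapReduce job (assumption: not Hadoop itself):
    # map every record, sort by key, group by key, then reduce each group.
    mapped = [pair for k, v in records for pair in mapper(k, v)]
    mapped.sort(key=itemgetter(0))
    output = []
    for key, group in groupby(mapped, key=itemgetter(0)):
        output.extend(reducer(key, (v for _, v in group)))
    return output


def run_chain(records, stages):
    # Feed the output of each stage into the next one.
    for mapper, reducer in stages:
        records = run_stage(records, mapper, reducer)
    return records


# The whole job sequence is declared in one place, in execution order.
STAGES = [(count_mapper, count_reducer), (invert_mapper, invert_reducer)]

if __name__ == "__main__":
    lines = [(0, "a b a"), (1, "b c a")]
    print(run_chain(lines, STAGES))
```

The point of the sketch is only the shape: each stage is a mapper/reducer pair, and the second stage consumes the first stage's output. On a real cluster the sort and grouping between map and reduce are done by Hadoop, not by the script.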
