Generating Multiple Output files with Hadoop 0.20+

隐瞒了意图╮ 2021-01-03 04:52

I am trying to write the results of my reducer to multiple files. All of the data results go into one file, and the rest of the results are split into files based on a category.

2 Answers
  •  青春惊慌失措
    2021-01-03 05:35

    You can do this in Hadoop 0.20, but as mentioned, you have to use the older API.

    There's some very rough code that does this at http://github.com/orngejaket/Info_Moist_1_Splicer/tree/master/src/contrib/streaming/src/java/org/infochimps/hadoop/mapred/lib/

    The resulting jar writes each record to a file named after its (sanitized) key.
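
To make the "file named after its (sanitized) key" idea concrete, here is a minimal sketch of the naming step in plain Java. It is not taken from the linked repository. The class name `KeyFileNames`, the methods `sanitize` and `fileNameFor`, the allowed character set, and the 64-character limit are all assumptions made for illustration. With the older API, logic like this would typically be called from an override of `generateFileNameForKeyValue(K key, V value, String name)` in a subclass of `org.apache.hadoop.mapred.lib.MultipleTextOutputFormat`.

```java
// Hypothetical helper: turns an arbitrary record key into a safe relative
// output path. Names and limits here are illustrative assumptions.
final class KeyFileNames {
    // Assumed upper bound on the sanitized key length.
    private static final int MAX_LENGTH = 64;

    private KeyFileNames() {}

    // Replace every character outside [A-Za-z0-9._-] with '_' so the key
    // cannot contain path separators, spaces, or control characters.
    static String sanitize(String key) {
        if (key == null || key.isEmpty()) {
            return "empty-key";
        }
        String cleaned = key.replaceAll("[^A-Za-z0-9._-]", "_");
        // Hadoop's FileInputFormat skips paths starting with '_' or '.' by
        // default, so prefix such names to keep the output visible downstream.
        if (cleaned.startsWith(".") || cleaned.startsWith("_")) {
            cleaned = "key" + cleaned;
        }
        if (cleaned.length() > MAX_LENGTH) {
            cleaned = cleaned.substring(0, MAX_LENGTH);
        }
        return cleaned;
    }

    // Build "<sanitized key>/<leaf name>", where leafName is the default
    // part-file name (for example "part-00000") passed in by the framework.
    static String fileNameFor(String key, String leafName) {
        return sanitize(key) + "/" + leafName;
    }
}
```

In a job using the older API, the subclass's `generateFileNameForKeyValue` could return `KeyFileNames.fileNameFor(key.toString(), name)`, and the subclass would be registered with `JobConf.setOutputFormat`. Putting each key in its own directory rather than its own flat file is a design choice of this sketch: it keeps output from different reducers for the same key from colliding.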
