MultipleTextOutputFormat alternative in new API

[愿得一人] 2020-12-10 16:11

As it stands, MultipleTextOutputFormat has not been migrated to the new API. So if we need to choose an output directory and output filename based on the key-value being

3 answers
  •  攒了一身酷
    2020-12-10 16:50

    For the best answer, turn to Hadoop: The Definitive Guide, 3rd ed. (starting on p. 253).

    An excerpt from the book:

    "In the old MapReduce API, there are two classes for producing multiple outputs: MultipleOutputFormat and MultipleOutputs. In a nutshell, MultipleOutputs is more fully featured, but MultipleOutputFormat has more control over the output directory structure and file naming. MultipleOutputs in the new API combines the best features of the two multiple output classes in the old API."

    The book includes an example showing how to control the directory structure, file naming, and output format using the MultipleOutputs API.
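
To make the idea concrete, here is a minimal sketch of the one piece that depends on your data: turning a record key into the base output path string that the new-API `MultipleOutputs.write(key, value, baseOutputPath)` accepts. The class name `OutputPathSketch`, the method `baseOutputPath`, the `part` file prefix, and the sanitizing rule are all assumptions for illustration; they are not taken from the book or from Hadoop itself.

```java
import java.util.Objects;

/** Hypothetical helper (not part of Hadoop): derives a base output path from a record key. */
final class OutputPathSketch {
    private OutputPathSketch() {}

    /**
     * Maps a key to a relative path of the form "<directory>/part".
     * Assumed policy: every character outside [A-Za-z0-9_-] becomes '_',
     * so a key can never introduce extra path segments.
     */
    static String baseOutputPath(String key) {
        Objects.requireNonNull(key, "key");
        String dir = key.trim().replaceAll("[^A-Za-z0-9_-]", "_");
        if (dir.isEmpty()) {
            dir = "unknown"; // assumed fallback directory for blank keys
        }
        return dir + "/part";
    }

    public static void main(String[] args) {
        // One directory per key, relative to the job's output directory.
        System.out.println(baseOutputPath("2020-12-10"));
    }
}
```

In a reducer you would create a `MultipleOutputs` instance from the task context in `setup()`, pass the string above as the third argument to `write(...)`, and call `close()` in `cleanup()`. Hadoop appends a task suffix such as `-r-00000` to the base name, so records for the key above should end up in a file like `2020-12-10/part-r-00000` under the job output directory.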

    HTH.
