How to convert a .txt file to Hadoop's sequence file format

独厮守ぢ 2020-11-29 01:19

To make effective use of MapReduce jobs in Hadoop, I need the data to be stored in Hadoop's sequence file format. However, the data is currently only in flat .txt format. Can anyo

7 answers
  •  北海茫月
    2020-11-29 02:15

    It depends on the format of the .txt file. Is it one line per record? If so, you can simply use TextInputFormat, which creates one record for each line. In your mapper you can parse that line and use it whichever way you choose.
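
    As a rough illustration of that parsing step, here is a minimal sketch in plain Java, with no Hadoop dependency. It assumes each line has the form key<TAB>value; the class name LineRecordParser and the tab delimiter are hypothetical choices for this example and do not come from the question.

    ```java
    import java.util.AbstractMap.SimpleImmutableEntry;
    import java.util.Map;

    // Hypothetical helper: the class name and the tab delimiter are assumptions.
    public class LineRecordParser {

        /**
         * Splits one line of text into a key and a value at the first tab.
         * A line with no tab becomes a key with an empty value.
         */
        public static Map.Entry<String, String> parse(String line) {
            int tab = line.indexOf('\t');
            if (tab < 0) {
                return new SimpleImmutableEntry<>(line, "");
            }
            // Everything after the first tab, including later tabs, is the value.
            return new SimpleImmutableEntry<>(line.substring(0, tab), line.substring(tab + 1));
        }

        public static void main(String[] args) {
            Map.Entry<String, String> record = parse("user42\tclicked checkout");
            System.out.println(record.getKey() + " -> " + record.getValue());
        }
    }
    ```

    In a job that reads with TextInputFormat, the line arrives as the value passed to the mapper's map() method, so logic like parse() would run there. If the job's output format is set to SequenceFileOutputFormat, the key/value pairs the mapper writes should end up in a sequence file.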

    If it isn't one line per record, you might need to write your own InputFormat implementation. Take a look at this tutorial for more info.
