How to convert a .txt file to Hadoop's sequence file format

独厮守ぢ 2020-11-29 01:19

To use MapReduce jobs in Hadoop effectively, I need the data to be stored in Hadoop's sequence file format. However, the data is currently only in flat .txt format. Can anyo

7 answers

北海茫月 (OP)

2020-11-29 02:15

It depends on the format of the .txt file. Is it one line per record? If so, you can simply use TextInputFormat, which creates one record for each line. In your mapper, you can parse that line and use it however you choose.
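As a rough illustration of the "parse that line in your mapper" step, here is a minimal plain-Java sketch. The class name `LineRecordParser` and the tab-separated `key<TAB>value` layout are assumptions made for the example; the question does not say how the lines are structured.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Map;

// Hypothetical helper, not part of Hadoop. The "key<TAB>value" line layout
// is an assumption; adjust the split to match the real file.
final class LineRecordParser {
    private LineRecordParser() {}

    /** Splits one input line at the first tab into a (key, value) pair. */
    static Map.Entry<String, String> parse(String line) {
        int tab = line.indexOf('\t');
        if (tab < 0) {
            // No separator found: keep the whole line as the value, with an empty key.
            return new SimpleImmutableEntry<>("", line);
        }
        return new SimpleImmutableEntry<>(line.substring(0, tab), line.substring(tab + 1));
    }

    public static void main(String[] args) {
        Map.Entry<String, String> record = parse("42\thello world");
        // prints "42 -> hello world"
        System.out.println(record.getKey() + " -> " + record.getValue());
    }
}
```

In an actual job, logic like this would presumably sit inside the mapper's `map()` method, emitting the pair as `Text` key and value, with the job's output format set to `SequenceFileOutputFormat` so that the output lands in sequence file format.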

If it isn't one line per record, you might need to write your own InputFormat implementation. Take a look at this tutorial for more info.
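To give a feel for what a custom InputFormat has to do, the sketch below shows only the core grouping step in plain Java, without any Hadoop classes. It assumes records are separated by blank lines, which is purely an illustrative guess; the class name `MultiLineRecordGrouper` is hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the grouping logic only. The blank-line record
// delimiter is an assumption, not something stated in the thread.
final class MultiLineRecordGrouper {
    private MultiLineRecordGrouper() {}

    /** Joins consecutive non-blank lines into one record; blank lines end a record. */
    static List<String> groupRecords(Iterable<String> lines) {
        List<String> records = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                // Blank line: close the record in progress, if there is one.
                if (current.length() > 0) {
                    records.add(current.toString());
                    current.setLength(0);
                }
            } else {
                if (current.length() > 0) {
                    current.append('\n');
                }
                current.append(line);
            }
        }
        // Flush the last record when the input does not end with a blank line.
        if (current.length() > 0) {
            records.add(current.toString());
        }
        return records;
    }

    public static void main(String[] args) {
        List<String> records = groupRecords(Arrays.asList("id: 1", "name: a", "", "id: 2", "name: b"));
        System.out.println(records.size() + " records");
    }
}
```

In Hadoop itself, this kind of logic would typically live in a `RecordReader` returned by your `InputFormat` (for example a subclass of `FileInputFormat`). A real implementation also has to deal with records that straddle input split boundaries, which this sketch ignores.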
