Hadoop gzip compressed files

攒了一身酷 2020-12-09 10:37

I am new to Hadoop and am trying to process a Wikipedia dump. It's a 6.7 GB gzip-compressed XML file. I read that Hadoop supports gzip-compressed files but can only be processed

4 answers
  •  轮回少年
    2020-12-09 10:53

    Why not gunzip it and use splittable LZO compression instead?

    http://blog.cloudera.com/blog/2009/11/hadoop-at-twitter-part-1-splittable-lzo-compression/
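
A minimal sketch of the first step suggested above (gunzipping the dump), assuming the file sits on local disk and is streamed so that the 6.7 GB archive never has to fit in memory. The function name `gunzip_stream` and the chunk size are illustrative assumptions, not part of Hadoop. Recompressing the result as LZO and indexing it for splitting are separate steps handled by other tools and are not shown here.

```python
import gzip
import shutil


def gunzip_stream(src_path, dst_path, chunk_size=1024 * 1024):
    """Decompress a gzip file to dst_path in fixed-size chunks.

    gunzip_stream is a hypothetical helper name; chunk_size (1 MiB)
    is an arbitrary choice that keeps memory use bounded.
    """
    # gzip.open in "rb" mode yields the decompressed bytes as a stream.
    with gzip.open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        # copyfileobj reads and writes chunk_size bytes at a time.
        shutil.copyfileobj(src, dst, chunk_size)
```

For example, `gunzip_stream("dump.xml.gz", "dump.xml")` would write the uncompressed XML next to the archive; both file names are placeholders.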
