How to use CompressionCodec in Hadoop

Submitted by 倾然丶 夕夏残阳落幕 on 2019-12-06 10:44:19
Chris White

You should use the CompressionCodecFactory if you want to use compression outside of the standard OutputFormat handling (as detailed in @linker's answer):

// conf is a Hadoop Configuration; outputStream is the raw stream to wrap
CompressionCodecFactory ccf = new CompressionCodecFactory(conf);
CompressionCodec codec = ccf.getCodecByClassName(GzipCodec.class.getName());
OutputStream compressedOutputStream = codec.createOutputStream(outputStream);
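
For anyone who wants to try the factory approach end to end, here is a minimal sketch. It assumes the Hadoop common jars are on the classpath and writes into an in-memory buffer instead of a real file. The class name CodecFactoryDemo and the helper gzip are made up for illustration and are not part of Hadoop.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.CompressionCodecFactory;
import org.apache.hadoop.io.compress.GzipCodec;

public class CodecFactoryDemo {

    // Hypothetical helper: compresses a byte array with a codec obtained
    // from the factory, so the codec already carries the Configuration.
    static byte[] gzip(byte[] raw, Configuration conf) throws IOException {
        CompressionCodecFactory ccf = new CompressionCodecFactory(conf);
        CompressionCodec codec = ccf.getCodecByClassName(GzipCodec.class.getName());
        if (codec == null) {
            throw new IOException("GzipCodec is not registered with the factory");
        }

        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        // Closing the compressed stream finishes the gzip trailer.
        try (OutputStream out = codec.createOutputStream(sink)) {
            out.write(raw);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] packed = gzip("hello hadoop".getBytes(StandardCharsets.UTF_8),
                             new Configuration());
        System.out.println("compressed bytes: " + packed.length);
    }
}
```

The result is ordinary gzip data, so it should be readable with java.util.zip.GZIPInputStream or with codec.createInputStream.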

You're doing it wrong. The standard way to do this would be:

TextOutputFormat.setOutputCompressorClass(job, GzipCodec.class);
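
In context, that call sits in the job setup. A rough sketch, assuming the new org.apache.hadoop.mapreduce API and the Hadoop MapReduce client jars on the classpath; the class name CompressedOutputJob, the helper newGzipJob and the job name are hypothetical:

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.compress.GzipCodec;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class CompressedOutputJob {

    // Hypothetical helper: builds a Job whose text output is gzip-compressed.
    static Job newGzipJob(Configuration conf) throws IOException {
        Job job = Job.getInstance(conf, "gzip-output-example");
        job.setOutputFormatClass(TextOutputFormat.class);

        // Both setters are static methods on FileOutputFormat, which
        // TextOutputFormat extends, so either class name works as a prefix.
        FileOutputFormat.setCompressOutput(job, true);
        FileOutputFormat.setOutputCompressorClass(job, GzipCodec.class);
        return job;
    }

    public static void main(String[] args) throws IOException {
        Job job = newGzipJob(new Configuration());
        System.out.println("compress output: " + FileOutputFormat.getCompressOutput(job));
    }
}
```

With this in place the output format creates and configures the codec itself, so the job code never has to instantiate GzipCodec.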

GzipCodec is a Configurable, so if you instantiate it directly you have to initialize it properly (setConf, ...).
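
If you do need your own instance, one way to get it initialized is ReflectionUtils.newInstance, which calls setConf for you on Configurable classes. A small sketch, assuming the Hadoop common jars are on the classpath; DirectCodecDemo and newConfiguredGzipCodec are hypothetical names:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.compress.GzipCodec;
import org.apache.hadoop.util.ReflectionUtils;

public class DirectCodecDemo {

    // Hypothetical helper: creates a GzipCodec that already has its conf set.
    static GzipCodec newConfiguredGzipCodec(Configuration conf) {
        // Roughly equivalent to: GzipCodec c = new GzipCodec(); c.setConf(conf);
        return ReflectionUtils.newInstance(GzipCodec.class, conf);
    }

    public static void main(String[] args) throws IOException {
        GzipCodec codec = newConfiguredGzipCodec(new Configuration());

        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OutputStream out = codec.createOutputStream(sink)) {
            out.write("hello hadoop".getBytes(StandardCharsets.UTF_8));
        }
        System.out.println("extension: " + codec.getDefaultExtension());
    }
}
```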

Try this and let me know if that works.
