Kafka message codec - compress and decompress

Submitted by 纵然是瞬间 on 2019-11-29 11:09:18

As per my understanding, decompression is taken care of by the consumer itself. As mentioned on the official wiki page: "The consumer iterator transparently decompresses compressed data and only returns an uncompressed message."

As described in this article, the consumer works as follows:

The consumer has background "fetcher" threads that continuously fetch data in batches of 1MB from the brokers and add it to an internal blocking queue. The consumer thread dequeues data from this blocking queue, decompresses it, and iterates through the messages.
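The fetcher-plus-blocking-queue structure described above can be sketched with plain `java.util.concurrent` types. This is an illustrative toy, not Kafka's actual implementation: the class, method names, and canned batch are all made up for the sketch.

```java
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class FetcherSketch {
    // Stand-in for the consumer's internal blocking queue of fetched batches.
    static final BlockingQueue<List<String>> queue = new ArrayBlockingQueue<>(10);

    // Hypothetical background "fetcher": the real one continuously pulls
    // ~1MB batches from the brokers; here we just enqueue one canned batch.
    static void startFetcher() throws InterruptedException {
        Thread fetcher = new Thread(() -> {
            try {
                queue.put(List.of("msg-1", "msg-2", "msg-3"));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        fetcher.start();
        fetcher.join();
    }

    // Consumer-thread side: dequeue one batch and hand it back; this is the
    // point where the real consumer iterator would decompress the batch.
    static List<String> drainOneBatch() throws InterruptedException {
        return queue.take();
    }
}
```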

And the doc page, under End-to-end Batch Compression, states that

A batch of messages can be clumped together, compressed, and sent to the server in this form. This batch of messages will be written in compressed form, will remain compressed in the log, and will only be decompressed by the consumer.

So it appears that the decompression part is handled in the consumer itself; all you need to do is provide a valid/supported compression type via the compression.codec ProducerConfig attribute when creating the producer. I couldn't find any example or explanation describing a separate decompression step on the consumer end. Please correct me if I am wrong.
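To make the producer-side configuration concrete, here is a minimal sketch of building the config as a plain `Properties` object. Note the key name depends on the client version: the old 0.8.x Scala producer used `compression.codec`, while newer Java clients use `compression.type`. The broker address is a placeholder.

```java
import java.util.Properties;

public class ProducerCompressionConfig {
    public static Properties build() {
        Properties props = new Properties();
        // Placeholder broker list; adjust for your cluster.
        props.put("metadata.broker.list", "localhost:9092");
        // 0.8.x Scala producer key; newer Java clients use "compression.type".
        props.put("compression.codec", "gzip");  // or "snappy"
        props.put("serializer.class", "kafka.serializer.DefaultEncoder");
        return props;
    }
}
```

With this config the producer compresses batches before sending; no matching setting is needed on the consumer, which is the "transparent decompression" the docs describe.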

I have the same issue with v0.8.1. Compression/decompression in Kafka is poorly documented beyond saying the consumer should "transparently" decompress compressed data, which it NEVER did for me.

The example high-level consumer client using ConsumerIterator on the Kafka web site only works with uncompressed data. Once I enable compression in the producer client, messages never reach the following "while" loop. Hopefully they fix this issue ASAP, or they shouldn't claim this feature, since some users may use Kafka to transport large messages that need batching and compression capabilities.

ConsumerIterator<byte[], byte[]> it = stream.iterator();
while (it.hasNext()) {
    // message() should already return the decompressed payload
    String message = new String(it.next().message());
}

I have a small doubt regarding decompression on the Kafka consumer side when the producer sends a compressed stream (either GZIP or Snappy). It sounds like the Kafka consumer transparently decompresses the compressed stream on the consumer side. Please correct me on this; I am not quite sure here.

Or is there any decompression example on the Kafka consumer side, in case my understanding above is wrong?
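For intuition about what "transparent decompression" means, a GZIP round trip with the JDK's own `java.util.zip` shows that the compressed bytes recover the original payload exactly. This is only an illustration of the codec, not Kafka consumer code: Kafka performs the equivalent step internally before handing you the message.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipRoundTrip {
    // Compress raw bytes with GZIP, as a producer-side codec would.
    static byte[] compress(byte[] data) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(data);
        }
        return bos.toByteArray();
    }

    // Decompress GZIP bytes back to the original payload, as the
    // consumer iterator does internally before returning a message.
    static byte[] decompress(byte[] data) throws IOException {
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(data))) {
            return gz.readAllBytes();
        }
    }
}
```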
