How does Kafka store offsets for each topic?

Submitted by Deadly on 2019-12-03 05:54:53

Question


While polling Kafka, I have subscribed to multiple topics using the subscribe() function. Now I want to set the offset to start reading from for each topic, without resubscribing after every seek() and poll() on a topic. Will calling seek() for each topic in turn, before polling for data, achieve this? How exactly are the offsets stored in Kafka?

I have one partition per topic and just one consumer to read from all topics.


Answer 1:


How does Kafka store offsets for each topic?

Kafka has moved offset storage from ZooKeeper to the Kafka brokers. The reason is given below:

Zookeeper is not a good way to service a high-write load such as offset updates because zookeeper routes each write through every node and hence has no ability to partition or otherwise scale writes. We have always known this, but chose this implementation as a kind of "marriage of convenience" since we already depended on zk.

Kafka stores offset commits in a topic. When a consumer commits an offset, Kafka publishes a commit message to an internal offsets topic (described as a "commit-log" topic in the design, and named __consumer_offsets in current releases) and keeps an in-memory structure that maps group/topic/partition to the latest offset for fast retrieval. More design information can be found on this page about offset management.
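
To make the storage scheme above concrete, here is a minimal sketch in plain Java. It is not Kafka's broker code: the class name OffsetStoreSketch and its methods commit() and fetch() are hypothetical, and a simple list stands in for the offsets topic. It only illustrates the idea of appending every commit to a log while caching the latest offset per group/topic/partition.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the design described above; not Kafka's actual implementation.
final class OffsetStoreSketch {
    // Append-only list of commit messages, standing in for the offsets topic.
    private final List<Object[]> commitLog = new ArrayList<>();
    // In-memory cache: (group, topic, partition) -> latest committed offset.
    private final Map<List<Object>, Long> latest = new HashMap<>();

    private static List<Object> key(String group, String topic, int partition) {
        return Arrays.asList(group, topic, partition);
    }

    // Record a commit: append it to the log, then update the cache.
    void commit(String group, String topic, int partition, long offset) {
        commitLog.add(new Object[] {group, topic, partition, offset});
        latest.put(key(group, topic, partition), offset);
    }

    // Return the latest committed offset, or -1 if nothing was committed for this key.
    long fetch(String group, String topic, int partition) {
        return latest.getOrDefault(key(group, topic, partition), -1L);
    }

    // Number of commit messages appended so far.
    int logSize() {
        return commitLog.size();
    }

    public static void main(String[] args) {
        OffsetStoreSketch store = new OffsetStoreSketch();
        store.commit("group-a", "orders", 0, 5L);
        store.commit("group-a", "orders", 0, 9L);   // a later commit replaces the cached value
        store.commit("group-a", "payments", 0, 3L); // each topic is tracked separately
        System.out.println(store.fetch("group-a", "orders", 0));   // prints 9
        System.out.println(store.fetch("group-a", "payments", 0)); // prints 3
        System.out.println(store.fetch("group-b", "orders", 0));   // prints -1
    }
}
```

Note that the key includes the consumer group, so two groups reading the same topic keep independent positions, and a single consumer subscribed to several topics has one entry per topic partition.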

Now, I want to set the offset from which I want to read from each topic, without resubscribing after every seek() and poll() from a topic.

Kafka's admin tools include a newer feature for resetting offsets (available since Kafka 0.11.0). The consumer group must be inactive while the reset runs:

kafka-consumer-groups.sh --bootstrap-server 127.0.0.1:9092 --group \
      your-consumer-group --reset-offsets --to-offset 1 --all-topics --execute

Other reset options are available, such as --to-earliest, --to-latest, --shift-by and --to-datetime.




Answer 2:


Kafka offsets are tracked on the consumer side. Each consumer stores its own offset for each topic; with the old (pre-0.9) consumer this was usually in ZooKeeper, whereas newer consumers commit offsets to Kafka itself.



Source: https://stackoverflow.com/questions/45686885/how-does-kafka-store-offsets-for-each-topic
