How to reset Kafka offsets to match tail position?

Submitted by 风流意气都作罢 on 2019-12-10 17:54:10

Question


We're using Storm with Kafka and ZooKeeper. We had to delete some topics and recreate them under different names. Our Kafka spouts stayed the same, apart from now reading from the new topic names. However, the spouts are now using the offsets from the old topic partitions when they try to read from the new topics. For example, the tail position of my-topic-name partition 0 will be 500, but the offset will be something like 10000.

Is there a way to reset the offset position so it matches the tail of the topic?


Answer 1:


There are multiple options (Storm's KafkaSpout does not provide any API to define the starting offset).

  1. If you want to consume from the tail of the log, you should delete the old offsets
    • depending on your Kafka version
      • (pre 0.9) you can manipulate ZK (which is a little tricky)
      • (0.9+) or you can try to delete the offsets from the topic __consumer_offsets (which is also tricky and might delete other offsets you want to preserve, too)
    • if no offsets are there, you can restart your spout with the auto offset reset policy "latest" or "largest" (depending on your Kafka version)
  2. As an alternative (which I would recommend), you can write a small client application that uses seek() to manipulate the offsets in the way you need and commit() to commit them. This client must use the same group ID as your KafkaSpout and must subscribe to the same topic(s). Furthermore, you need to make sure that this client application is running as the single consumer group member so it gets all partitions assigned.
    • for this, you can either seek to the end of the log and commit
    • or commit an invalid offset (like -1) and rely on the auto offset reset configuration "latest" or "largest" (depending on your Kafka version)
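
The "seek to the end of the log and commit" variant of option 2 could look roughly like the sketch below. It is only an illustration, written against the Java consumer API (kafka-clients 0.10.1 or newer), and it assumes the spout commits its offsets to Kafka under a consumer group ID. The broker address and the group ID "my-spout-group" are hypothetical placeholders. Note one substitution: instead of subscribe(), the sketch uses assign() on every partition of the topic, which avoids waiting for a group rebalance. As far as I know, the broker only accepts such a commit while the group has no active members, so the spout should be stopped before running it.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.stream.Collectors;

import org.apache.kafka.clients.consumer.Consumer;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.ByteArrayDeserializer;

class ResetOffsetsToTail {

    // Commits the current log-end offset for every partition of `topic`,
    // using whatever group ID the given consumer was configured with.
    static Map<TopicPartition, OffsetAndMetadata> resetToTail(Consumer<?, ?> consumer, String topic) {
        List<TopicPartition> partitions = consumer.partitionsFor(topic).stream()
                .map(info -> new TopicPartition(info.topic(), info.partition()))
                .collect(Collectors.toList());

        // Manual assignment (a substitution for subscribe()): this client
        // takes all partitions itself, without joining a rebalance.
        consumer.assign(partitions);

        // seekToEnd() is lazy; the position() calls below force it to resolve.
        consumer.seekToEnd(partitions);

        Map<TopicPartition, OffsetAndMetadata> offsets = new HashMap<>();
        for (TopicPartition tp : partitions) {
            offsets.put(tp, new OffsetAndMetadata(consumer.position(tp)));
        }
        consumer.commitSync(offsets);
        return offsets;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        // Assumed values: replace with your broker list and the spout's group ID.
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "my-spout-group");
        // Only the explicit commitSync() above should write offsets.
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, ByteArrayDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, ByteArrayDeserializer.class.getName());

        try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
            resetToTail(consumer, "my-topic-name")
                    .forEach((tp, committed) -> System.out.println(tp + " -> " + committed.offset()));
        }
    }
}
```

Run it once per topic while the topology is stopped, then restart the spout. If the spout reads its starting position from the committed offsets of that group, it should resume from the tail.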

For Kafka Streams, there is an "Application Reset Tool" that manipulates committed offsets in a similar way. If you want more details, you can read this blog post: http://www.confluent.io/blog/data-reprocessing-with-kafka-streams-resetting-a-streams-application/

(disclaimer: I am the author of the post and it is about Kafka Streams -- nevertheless, the underlying offset manipulation ideas are the same)



Source: https://stackoverflow.com/questions/40271847/how-to-reset-kafka-offsets-to-match-tail-position
