Question
We're using Storm with Kafka and ZooKeeper. We had to delete some topics and recreate them under different names. Our Kafka spouts stayed the same, apart from now reading from the new topic names. However, the spouts are now using the offsets from the old topic partitions when they try to read from the new topics. So the tail position of my-topic-name partition 0 will be 500, but the offset will be something like 10000.
Is there a way to reset the offset position so it matches the tail of the topic?
Answer 1:
There are multiple options (as Storm's KafkaSpout does not provide any API to define the starting offset).
- If you want to consume from the tail of the log, you should delete the old offsets
  - depending on your Kafka version
    - (pre 0.9) you can manipulate the offsets stored in ZooKeeper (which is a little tricky)
    - (0.9+) or you can try to delete the offsets from the topic __consumer_offsets (which is also tricky and might delete other offsets you want to preserve, too)
  - if no offsets are there, you can restart your spout with the auto offset reset policy "latest" or "largest" (depending on your Kafka version)
- As an alternative (which I would recommend), you can write a small client application that uses seek() to manipulate the offsets in the way you need them and commit() to commit them. This client must use the same group ID as your KafkaSpout and must subscribe to the same topic(s). Furthermore, you need to make sure that this client application is running as a single consumer group member so it gets all partitions assigned.
  - for this, you can either seek to the end of the log and commit
  - or you commit an invalid offset (like -1) and rely on the auto offset reset configuration "latest" or "largest" (depending on your Kafka version)
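As a rough illustration of the "seek to the end of the log and commit" option, here is a minimal sketch in Python. It assumes the kafka-python client and a spout that commits its offsets to Kafka under a consumer group. The broker address, the group ID, and the helper names commit_tail_offsets and main are hypothetical. It also uses manual assignment via assign() rather than subscribe(), a simplification that hands every partition to this one client without a rebalance, so the spout should be stopped while it runs.

```python
def commit_tail_offsets(consumer, partitions):
    """Commit the log-end offset of every given partition for the consumer's group.

    `consumer` is expected to behave like kafka-python's KafkaConsumer
    (assign, seek_to_end, position, commit). Hypothetical helper, not part
    of Storm or Kafka.
    """
    partitions = list(partitions)
    if not partitions:
        return {}
    # Manual assignment: this single client takes every partition itself.
    consumer.assign(partitions)
    # Mark each partition to be repositioned at its tail (resolved lazily).
    consumer.seek_to_end(*partitions)
    # position() forces the seek to resolve and returns the next offset to read.
    tails = {tp: consumer.position(tp) for tp in partitions}
    # With no arguments, commit() synchronously commits the current positions.
    consumer.commit()
    return tails


def main(topic="my-topic-name", group_id="my-spout-group", servers="localhost:9092"):
    # Imported here so the helper above can be exercised without a broker.
    # The default group ID and broker address are placeholders (assumptions).
    from kafka import KafkaConsumer, TopicPartition

    consumer = KafkaConsumer(
        bootstrap_servers=servers,
        group_id=group_id,         # must match the group ID the spout uses
        enable_auto_commit=False,  # commit only the offsets chosen above
    )
    try:
        ids = consumer.partitions_for_topic(topic) or set()
        partitions = [TopicPartition(topic, p) for p in sorted(ids)]
        for tp, offset in commit_tail_offsets(consumer, partitions).items():
            print(f"{tp.topic}-{tp.partition}: committed offset {offset}")
    finally:
        consumer.close()
```

Calling main() with the spout's real group ID and topic should leave the group's committed offsets at the tail of each partition, so a restarted spout would pick up from there. If the spout keeps its offsets in ZooKeeper instead, this sketch does not apply.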
For Kafka Streams, there is an "Application Reset Tool" that does a similar thing to manipulate committed offsets. For details, you can read this blog post: http://www.confluent.io/blog/data-reprocessing-with-kafka-streams-resetting-a-streams-application/
(disclaimer: I am the author of the post and it is about Kafka Streams -- nevertheless, the underlying offset manipulation ideas are the same)
Source: https://stackoverflow.com/questions/40271847/how-to-reset-kafka-offsets-to-match-tail-position