kafka-consumer-api

Kafka Consumers are balanced across topics

给你一囗甜甜゛ submitted on 2019-12-25 00:27:25
Question: I am creating 6 consumers with the same group-id. All the consumers are subscribing to 2 topics with 3 partitions each. Since there are 6 consumers and 6 partitions across 2 topics, I am expecting all the consumers to be used. But I don't see all the consumers being used; is there a way I can force it to rebalance? I am using Kafka 0.10.2.0. Answer 1: Assigning partitions to consumers in the same consumer group is not done across topics. Following what is happening to you ... You have t1_p0, t1_p1 and …
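
The behavior described comes from the default range assignor, which distributes the partitions of each topic separately, so a 6-member group reading two 3-partition topics can leave three members idle. A minimal sketch of one way to spread the partitions of all subscribed topics across the group by switching to the round-robin assignor; the broker address, group id, and topic names below are placeholder assumptions:

```java
import java.util.Arrays;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.RoundRobinAssignor;

public class RoundRobinConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "my-group");                // same group-id on all 6 consumers
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                  "org.apache.kafka.common.serialization.StringDeserializer");
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                  "org.apache.kafka.common.serialization.StringDeserializer");
        // Spread the partitions of *all* subscribed topics across the group,
        // instead of the default per-topic range assignment.
        props.put(ConsumerConfig.PARTITION_ASSIGNMENT_STRATEGY_CONFIG,
                  RoundRobinAssignor.class.getName());

        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        consumer.subscribe(Arrays.asList("topic1", "topic2")); // hypothetical topic names
        // poll loop ...
    }
}
```

All members of a group should be configured with the same assignment strategy.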

Cannot produce message when main thread sleeps less than 1000

浪尽此生 submitted on 2019-12-24 18:39:47
Question: When I am using the Java API of Kafka, if I let my main thread sleep for less than 2000 ns, it cannot produce any message. I really want to know why this happens. Here is my producer: public class Producer { private final KafkaProducer<String, String> producer; private final String topic; public Producer(String topic, String[] args) { //...... //...... producer = new KafkaProducer<>(props); this.topic = topic; } public void producerMsg() throws InterruptedException { String data = "Apache Storm is a …
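
One likely explanation (assuming the elided producerMsg() just calls send() and then main() returns): KafkaProducer.send() is asynchronous and only enqueues the record into a client-side batch, so if the JVM exits before the background sender thread delivers the batch, nothing reaches the broker; the sleep merely gives the sender time to run. A hedged sketch that forces delivery with flush()/close() instead of sleeping; the broker address and topic are placeholders:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class FlushingProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed broker
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        KafkaProducer<String, String> producer = new KafkaProducer<>(props);
        // send() only enqueues the record into an in-memory batch; delivery is asynchronous.
        producer.send(new ProducerRecord<>("my_topic", "key", "Apache Storm is a ..."));
        // Block until all buffered records have actually been sent, so the message
        // is not lost when main() returns immediately afterwards.
        producer.flush();
        producer.close();
    }
}
```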

Why isn't this Kafka consumer shutting down?

亡梦爱人 submitted on 2019-12-24 15:27:16
Question: I am expecting the consume test to read just one message and shut down. However, it isn't, even after I call consumer.shutdown(). Wondering why? Test: public class AppTest { App app=new App(); @org.junit.Test public void publish() { assertTrue(app.publish("temptopic","message")); } @org.junit.Test public void consume() { app.publish("temptopic","message"); assertEquals("message",app.consume("temptopic","tempgroup")); } } Class under test: public class App { public boolean publish…
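
Since App's consume loop isn't shown in full, a plausible cause is that the polling thread is blocked inside poll() and a home-grown shutdown flag is never checked; the supported way to break out is KafkaConsumer.wakeup() from another thread, catching the resulting WakeupException. A rough sketch of that pattern (the class and method names here are illustrative, not App's actual API):

```java
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.errors.WakeupException;

public class ShutdownAwareConsumer implements Runnable {
    private final KafkaConsumer<String, String> consumer;

    public ShutdownAwareConsumer(Properties props, String topic) {
        consumer = new KafkaConsumer<>(props);
        consumer.subscribe(Collections.singletonList(topic));
    }

    @Override
    public void run() {
        try {
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(100);
                for (ConsumerRecord<String, String> record : records) {
                    System.out.println(record.value());
                }
            }
        } catch (WakeupException e) {
            // Expected on shutdown: wakeup() makes a blocked poll() throw this.
        } finally {
            consumer.close();
        }
    }

    // Safe to call from another thread (e.g. the test thread) to break out of poll().
    public void shutdown() {
        consumer.wakeup();
    }
}
```

The test would call shutdown() once the expected message has been asserted, which unblocks poll() and lets the finally block close the consumer.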

How to get kafka offset data, specified on timestamp

ⅰ亾dé卋堺 submitted on 2019-12-24 10:08:21
Question: I've tried to get the offset from a Kafka topic based on a timestamp; when I tried to run it, it was throwing a null pointer error. Map<TopicPartition, Long> timestampsToSearch = new HashMap<>(); for (TopicPartition partition : partitions) { timestampsToSearch.put(partition, startTimestamp); } Map<TopicPartition, OffsetAndTimestamp> outOffsets = consumer.offsetsForTimes(timestampsToSearch); for (TopicPartition partition : partitions) { Long seekOffset = outOffsets.get(partition).offset(); consumer.seek…
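
The NullPointerException most likely comes from outOffsets.get(partition).offset(): offsetsForTimes() maps a partition to null when it has no message with a timestamp at or after the one requested. A hedged sketch with the missing null check; the helper name and the seekToEnd fallback are choices of this example, not the only option:

```java
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndTimestamp;
import org.apache.kafka.common.TopicPartition;

public class SeekToTimestamp {
    // Hypothetical helper: seek each partition to the first offset at or after startTimestamp.
    static void seekToTimestamp(KafkaConsumer<?, ?> consumer,
                                Collection<TopicPartition> partitions,
                                long startTimestamp) {
        Map<TopicPartition, Long> timestampsToSearch = new HashMap<>();
        for (TopicPartition partition : partitions) {
            timestampsToSearch.put(partition, startTimestamp);
        }
        Map<TopicPartition, OffsetAndTimestamp> outOffsets =
                consumer.offsetsForTimes(timestampsToSearch);
        for (TopicPartition partition : partitions) {
            OffsetAndTimestamp found = outOffsets.get(partition);
            if (found != null) {
                consumer.seek(partition, found.offset());
            } else {
                // offsetsForTimes() returns null for a partition with no message whose
                // timestamp is >= startTimestamp; dereferencing it is the NPE in the question.
                consumer.seekToEnd(Collections.singletonList(partition));
            }
        }
    }
}
```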

Kafka Proper Way to Poll No Records

孤街醉人 submitted on 2019-12-24 08:37:27
Question: To keep my consumer alive (very long, variable-length processing) I'm implementing an empty poll() call in a background thread that will keep the broker from rebalancing if I spend too much time between polls. I have set my poll interval to be very long, but I don't want to just keep increasing it forever for longer and longer processing. What's the proper way to poll for no records? Currently I'm calling poll(), then re-seeking back to the earliest offsets for each partition returned in…
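
One commonly suggested approach is to pause all assigned partitions before the keep-alive poll(): a paused consumer still resets the max.poll.interval.ms timer when poll() is called, but it returns no records and does not advance any positions, so no re-seeking is needed. Note that KafkaConsumer is not thread-safe, so this must run on the thread that owns the consumer (or be externally synchronized) rather than in an unsynchronized background thread. A minimal sketch; the helper name is hypothetical:

```java
import java.util.Set;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class KeepAlivePoller {
    // Hypothetical helper: call poll() without fetching or advancing any offsets,
    // so the poll interval is reset while the real records are still being processed.
    static void keepAlivePoll(KafkaConsumer<?, ?> consumer) {
        Set<TopicPartition> assignment = consumer.assignment();
        consumer.pause(assignment);   // paused partitions return no records from poll()
        consumer.poll(0);             // resets the poll interval without touching positions
        consumer.resume(assignment);  // resume fetching when real processing continues
    }
}
```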

Kafka AvroConsumer consume from timestamp using offsets_for_times

亡梦爱人 submitted on 2019-12-24 07:47:48
Question: Trying to use confluent_kafka.AvroConsumer to consume messages from a given timestamp. if flag: # creating a list topic_partitons_to_search = list( map(lambda p: TopicPartition('my_topic2', p, int(time.time())), range(0, 1))) print("Searching for offsets with %s" % topic_partitons_to_search) offsets = c.offsets_for_times(topic_partitons_to_search, timeout=1.0) print("offsets_for_times results: %s" % offsets) for x in offsets: c.seek(x) flag=False The console returns this: Searching for offsets…

Caching DStream in Spark Streaming

家住魔仙堡 submitted on 2019-12-24 07:40:04
Question: I have a Spark Streaming process which reads data from Kafka into a DStream. In my pipeline I do this twice (one after another): DStream.foreachRDD( transformations on the RDD and inserting into a destination ) (each time I do different processing and insert the data into a different destination). I was wondering how DStream.cache, right after I read data from Kafka, would work. Is it possible to do it? Is the process now actually reading the data two times from Kafka? Please keep in mind that it is not…
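
For reference, caching is applied to the DStream before the two foreachRDD calls: each batch's RDD is then persisted on first use, so the second action reuses the cached blocks rather than fetching the batch from Kafka again. A hedged sketch assuming the direct stream API; the broker address, topic, and group id are placeholders, not the asker's actual setup:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka010.ConsumerStrategies;
import org.apache.spark.streaming.kafka010.KafkaUtils;
import org.apache.spark.streaming.kafka010.LocationStrategies;

public class CachedDStreamSketch {
    public static void main(String[] args) throws InterruptedException {
        SparkConf conf = new SparkConf().setAppName("cache-sketch").setMaster("local[2]");
        JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(10));

        Map<String, Object> kafkaParams = new HashMap<>();
        kafkaParams.put("bootstrap.servers", "localhost:9092");   // assumed broker address
        kafkaParams.put("key.deserializer", StringDeserializer.class);
        kafkaParams.put("value.deserializer", StringDeserializer.class);
        kafkaParams.put("group.id", "cache-sketch-group");         // hypothetical group id

        JavaInputDStream<ConsumerRecord<String, String>> stream = KafkaUtils.createDirectStream(
                jssc,
                LocationStrategies.PreferConsistent(),
                ConsumerStrategies.<String, String>Subscribe(Arrays.asList("my_topic"), kafkaParams));

        // Persist each batch's RDD the first time it is computed, so the second
        // foreachRDD action below reuses the cached blocks instead of re-reading Kafka.
        stream.cache();

        stream.foreachRDD(rdd -> { /* first set of transformations, write to destination A */ });
        stream.foreachRDD(rdd -> { /* second set of transformations, write to destination B */ });

        jssc.start();
        jssc.awaitTermination();
    }
}
```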

How to change the Kafka committed consumer offset to a required offset

烂漫一生 submitted on 2019-12-24 06:33:31
Question: I have a Kafka Streams application. My application is processing the events successfully. How do I change the committed consumer offset to a required offset so I can reprocess or skip events? I tried "How to change start offset for topic?", but I got a 'Node does not exist:' error. Please help me. Answer 1: The question/answer you are referring to is based on an older Kafka version. Since Kafka 0.9, offsets are not committed to ZooKeeper but stored in a special Kafka topic called the offsets topic (topic name…
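
Since the offsets live in that internal topic rather than in ZooKeeper, one way to move them (sketched below, with the Streams application stopped) is to start a plain consumer with the same group.id as the application (a Streams app uses its application.id as the group id) and commit the desired offset explicitly; the topic name, partition, and offset here are placeholders. Newer Kafka versions also ship a kafka-consumer-groups.sh --reset-offsets tool for the same job.

```java
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

public class OffsetRewinder {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");  // assumed broker
        props.put("group.id", "my-streams-app-id");        // the Streams application.id
        props.put("key.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        props.put("enable.auto.commit", "false");

        TopicPartition tp = new TopicPartition("input-topic", 0);  // hypothetical topic and partition
        long desiredOffset = 42L;                                  // offset to restart processing from

        try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
            // assign() (rather than subscribe()) commits without joining a rebalance.
            consumer.assign(Collections.singletonList(tp));
            // Writes the new position into the internal offsets topic; the stopped
            // application resumes from this offset when it is restarted.
            consumer.commitSync(Collections.singletonMap(tp, new OffsetAndMetadata(desiredOffset)));
        }
    }
}
```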

Kafka Consumer seekToBeginning

孤人 submitted on 2019-12-24 03:07:58
Question: I did not use a partition to publish to a Kafka topic: ProducerRecord(String topic, K key, V value). In the consumer, I would like to go to the beginning: seekToBeginning(Collection partitions). Is it possible to seek to the beginning without specifying a partition? Does Kafka assign a default partition? https://kafka.apache.org/0102/javadoc/org/apache/kafka/clients/producer/ProducerRecord.html https://kafka.apache.org/0102/javadoc/org/apache/kafka/clients/consumer/KafkaConsumer.html Answer 1: When producing,…
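
For completeness, a hedged sketch of rewinding without naming partitions by hand: records published without an explicit partition are spread over the topic's partitions by the default partitioner, and on the consumer side seekToBeginning() accepts an empty collection, which means "all currently assigned partitions". The broker address, topic, and group id below are placeholders:

```java
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class FromBeginningConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed broker
        props.put("group.id", "replay-group");             // hypothetical group id
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        consumer.subscribe(Collections.singletonList("my_topic"));

        // Partitions are assigned lazily, so poll once to let the group coordinator assign them...
        consumer.poll(0);
        // ...then rewind. An empty collection means "all currently assigned partitions",
        // so no explicit partition list is needed.
        consumer.seekToBeginning(Collections.emptyList());

        ConsumerRecords<String, String> records = consumer.poll(1000);
        records.forEach(r -> System.out.println(r.value()));
        consumer.close();
    }
}
```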

Kafka replays messages over and over - Heartbeat session expired - marking coordinator dead

断了今生、忘了曾经 submitted on 2019-12-24 03:04:48
Question: Using the Python Kafka API to read messages from a topic with only a handful of messages in it, Kafka keeps on replaying the messages in the queue over and over again. It receives a message from my topic (it comes back with each message's content), then throws "ERROR - Heartbeat session expired - marking coordinator dead" and keeps on looping through the rest of the messages and replaying them. More logs: kafka.coordinator - ERROR - Heartbeat session expired - marking coordinator dead kafka.coordinator…