kafka-consumer-api

How to control the concurrency of processing messages by ConsumerGroup

Submitted by ≯℡__Kan透↙ on 2020-01-02 23:58:51
Question: I am using the kafka-node ConsumerGroup to consume messages from a topic. When the ConsumerGroup consumes a message it must call an external API, which might take up to a second to respond. I want to hold off consuming the next message from the queue until I receive the API response, so that messages are processed sequentially. How should I control this behavior? Answer 1: This is how we have implemented processing of one message at a time: var async = require('async'); // npm install async
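The excerpt cuts off before the rest of the kafka-node answer. As a rough illustration of the same one-message-at-a-time idea, here is a minimal sketch using the Kafka Java client rather than kafka-node; the broker address, group id, topic name and the callExternalApi helper are all assumptions, not part of the original answer:

import org.apache.kafka.clients.consumer.*;
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

public class SequentialConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "sequential-group");        // hypothetical group id
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");         // commit only after processing
        props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, "1");               // at most one record per poll
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                  "org.apache.kafka.common.serialization.StringDeserializer");
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                  "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("my-topic")); // hypothetical topic
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    callExternalApi(record.value()); // blocking call; next poll happens only afterwards
                    consumer.commitSync();           // commit the offset once processing succeeded
                }
            }
        }
    }

    // Placeholder for the slow external API call described in the question.
    private static void callExternalApi(String payload) {
        // e.g. a synchronous HTTP request
    }
}

Setting max.poll.records to 1 and committing synchronously after each record keeps at most one message in flight per consumer, which is the behavior the question asks for.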

Kafka Stream reprocessing old messages on rebalancing

Submitted by 拈花ヽ惹草 on 2020-01-02 23:14:17
Question: I have a Kafka Streams application which reads data from a few topics, joins the data and writes it to another topic. This is the configuration of my Kafka cluster: 5 Kafka brokers; Kafka topics with 15 partitions and a replication factor of 3. My Kafka Streams applications are running on the same machines as my Kafka brokers. A few million records are consumed/produced per hour. Whenever I take a broker down, the application goes into rebalancing state and after rebalancing many times it starts
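The excerpt stops before the accepted answer, so the actual resolution is not shown here. One Streams setting that is often discussed when rebalances trigger long state restoration is num.standby.replicas; the following is only a hedged illustration of that knob (application id, broker list and topology are placeholders), not a confirmed fix for this question:

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import java.util.Properties;

public class StreamsWithStandbys {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "join-app");        // hypothetical application id
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092"); // assumed broker list
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        // Keep a warm copy of each local state store on another instance, so a rebalance
        // can promote the standby instead of replaying the whole changelog.
        props.put(StreamsConfig.NUM_STANDBY_REPLICAS_CONFIG, 1);
        // Replication factor for the Streams-internal changelog/repartition topics.
        props.put(StreamsConfig.REPLICATION_FACTOR_CONFIG, 3);

        StreamsBuilder builder = new StreamsBuilder();
        // Placeholder topology; the question's real topology joins several topics.
        builder.stream("input-topic").to("output-topic");

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
    }
}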

Apache Kafka get list of consumers on a specific topic

Submitted by 随声附和 on 2020-01-02 19:31:52
Question: As can be guessed from the title, is there a way to get the list of consumers on a specific topic in Java? Until now I have been able to get the list of topics like this: final ListTopicsResult listTopicsResult = adminClient.listTopics(); KafkaFuture<Set<String>> kafkaFuture = listTopicsResult.names(); Set<String> topics = kafkaFuture.get(); but I haven't found a way to get the list of consumers on each topic. Answer 1: I was recently solving the same problem for my kafka client tool. It is not easy, but the only
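The answer is truncated, but the Java AdminClient can get there by listing all consumer groups, describing them, and keeping the groups whose members have partitions of the topic assigned. A minimal sketch, with the broker address and topic name as assumptions:

import org.apache.kafka.clients.admin.*;
import org.apache.kafka.common.TopicPartition;
import java.util.*;
import java.util.stream.Collectors;

public class ConsumersOfTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker
        String topic = "my-topic"; // hypothetical topic name

        try (AdminClient adminClient = AdminClient.create(props)) {
            // 1. List every consumer group known to the cluster.
            Collection<ConsumerGroupListing> groups = adminClient.listConsumerGroups().all().get();
            List<String> groupIds = groups.stream()
                    .map(ConsumerGroupListing::groupId)
                    .collect(Collectors.toList());

            // 2. Describe the groups and keep those with a member assigned to the topic.
            Map<String, ConsumerGroupDescription> descriptions =
                    adminClient.describeConsumerGroups(groupIds).all().get();
            for (ConsumerGroupDescription desc : descriptions.values()) {
                boolean consumesTopic = desc.members().stream()
                        .flatMap(m -> m.assignment().topicPartitions().stream())
                        .map(TopicPartition::topic)
                        .anyMatch(topic::equals);
                if (consumesTopic) {
                    System.out.println("Group " + desc.groupId() + " consumes " + topic);
                }
            }
        }
    }
}

Note this only finds groups with currently connected members; groups that have committed offsets for the topic but have no live members would need an extra check via listConsumerGroupOffsets.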

Continuous consumer group rebalancing with more consumers than partitions

Submitted by 試著忘記壹切 on 2020-01-02 10:07:00
Question: Given the following setup: Kafka v0.11.0.0; 3 brokers; 2 topics, each with 2 partitions and a replication factor of 3; 2 consumer groups, one for each topic; and 3 servers that run the consumers. Each server contains two consumers, one for each topic, such that: Server A runs consumer-A1 in group topic-1-group consuming topic-1 and consumer-A2 in group topic-2-group consuming topic-2; Server B runs consumer-B1 in group topic-1-group consuming topic-1 and consumer-B2 in group topic-2-group consuming topic-2; Server C runs consumer-C1
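The excerpt is cut off before the symptom and any answer are fully described. When investigating continuous rebalancing, one generic diagnostic is a ConsumerRebalanceListener that timestamps every revoke/assign cycle; this sketch is only that diagnostic aid (the broker address is assumed, the group and topic names are taken from the question), not the accepted answer:

import org.apache.kafka.clients.consumer.*;
import org.apache.kafka.common.TopicPartition;
import java.time.Duration;
import java.util.Collection;
import java.util.Collections;
import java.util.Properties;

public class RebalanceLogger {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "topic-1-group");           // group name from the question
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                  "org.apache.kafka.common.serialization.StringDeserializer");
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                  "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("topic-1"), new ConsumerRebalanceListener() {
                @Override
                public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
                    System.out.println(System.currentTimeMillis() + " revoked: " + partitions);
                }
                @Override
                public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
                    System.out.println(System.currentTimeMillis() + " assigned: " + partitions);
                }
            });
            while (true) {
                // The timestamps printed by the callbacks reveal how often the group rebalances.
                consumer.poll(Duration.ofMillis(1000));
            }
        }
    }
}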

Kafka consumer startup delay confluent dotnet

Submitted by 旧街凉风 on 2020-01-02 06:28:31
Question: When starting up a confluent-dotnet consumer, after the call to subscribe and subsequent polling, it seems to take a very long time (about 10-15 seconds) to receive the "Partition assigned" event from the server, and therefore messages. At first I thought there was auto topic creation overhead, but the time is the same whether or not the topic and consumer group already exist. I start my consumer with this config; the rest of the code is the same as in the confluent advanced

Topics, partitions and keys

Submitted by 点点圈 on 2019-12-31 09:14:08
Question: I am looking for some clarification on the subject. In the Kafka documentation I found the following: Kafka only provides a total order over messages within a partition, not between different partitions in a topic. Per-partition ordering combined with the ability to partition data by key is sufficient for most applications. However, if you require a total order over messages this can be achieved with a topic that has only one partition, though this will mean only one consumer process per
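To make the quoted guarantee concrete, here is a minimal producer sketch (broker address, topic name and keys are hypothetical): records that share a key are hashed to the same partition, so they are consumed in production order, while records with different keys may land on different partitions and interleave.

import org.apache.kafka.clients.producer.*;
import java.util.Properties;

public class KeyedProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
                  "org.apache.kafka.common.serialization.StringSerializer");
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
                  "org.apache.kafka.common.serialization.StringSerializer");

        try (Producer<String, String> producer = new KafkaProducer<>(props)) {
            // Both events for key "user-42" hash to the same partition, so their relative order holds.
            producer.send(new ProducerRecord<>("events", "user-42", "created"));
            producer.send(new ProducerRecord<>("events", "user-42", "updated"));
            // A different key may go to a different partition; no ordering guarantee relative to user-42.
            producer.send(new ProducerRecord<>("events", "user-7", "created"));
        }
    }
}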

How to get latest offset for a partition for a kafka topic?

Submitted by 僤鯓⒐⒋嵵緔 on 2019-12-31 08:58:07
Question: I am using the Python high-level consumer for Kafka and want to know the latest offset for each partition of a topic. However, I cannot get it to work:

from kafka import TopicPartition
from kafka.consumer import KafkaConsumer

con = KafkaConsumer(bootstrap_servers=brokers)
ps = [TopicPartition(topic, p) for p in con.partitions_for_topic(topic)]
con.assign(ps)
for p in ps:
    print "For partition %s highwater is %s" % (p.partition, con.highwater(p))
print "Subscription = %s" % con.subscription()
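The excerpt ends before the answer. For comparison, the Kafka Java consumer exposes the log-end offsets directly through endOffsets(); recent versions of kafka-python offer a similarly named end_offsets() method, though the Python variant is not shown here. A minimal Java sketch, with the broker address and topic name assumed:

import org.apache.kafka.clients.consumer.*;
import org.apache.kafka.common.TopicPartition;
import java.util.*;
import java.util.stream.Collectors;

public class LatestOffsets {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                  "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                  "org.apache.kafka.common.serialization.ByteArrayDeserializer");

        String topic = "my-topic"; // hypothetical topic
        try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
            List<TopicPartition> partitions = consumer.partitionsFor(topic).stream()
                    .map(info -> new TopicPartition(topic, info.partition()))
                    .collect(Collectors.toList());
            // endOffsets() asks the brokers for the current log-end offset of each partition,
            // without needing to assign partitions or poll first.
            Map<TopicPartition, Long> endOffsets = consumer.endOffsets(partitions);
            endOffsets.forEach((tp, offset) ->
                    System.out.println("Partition " + tp.partition() + " latest offset " + offset));
        }
    }
}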

How to check which partition a key is assigned to in Kafka?

Submitted by 巧了我就是萌 on 2019-12-31 03:52:05
Question: I am trying to debug an issue for which I need to prove that each distinct key only goes to one partition as long as the cluster is not rebalancing. So I was wondering, for a given topic, is there a way to determine which partition a key is sent to? Answer 1: As explained here or also in the source code, you need the byte[] keyBytes (assuming it isn't null); then, using org.apache.kafka.common.utils.Utils, you can run the following: Utils.toPositive(Utils.murmur2(keyBytes)) % numPartitions; For strings or
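As a self-contained sketch of the formula quoted in the answer, mirroring what the default partitioner does for a non-null key when no custom partitioner is configured (the partition count and the sample key are assumptions):

import org.apache.kafka.common.utils.Utils;
import java.nio.charset.StandardCharsets;

public class DefaultPartitionForKey {
    // Same hash the default partitioner applies to a non-null key.
    static int partitionForKey(String key, int numPartitions) {
        byte[] keyBytes = key.getBytes(StandardCharsets.UTF_8); // StringSerializer encodes keys as UTF-8
        return Utils.toPositive(Utils.murmur2(keyBytes)) % numPartitions;
    }

    public static void main(String[] args) {
        int numPartitions = 15; // assumed; use the topic's real partition count
        System.out.println("key 'user-42' -> partition " + partitionForKey("user-42", numPartitions));
    }
}

The result only stays stable while the partition count does not change, which matches the question's caveat about the cluster not rebalancing or being repartitioned.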

How to consume from two different clusters in Kafka?

Submitted by 守給你的承諾、 on 2019-12-31 02:51:46
Question: I have two Kafka clusters, say A and B, where B is a replica of A. I would like to consume messages from cluster B only if A is down, and vice versa, because consuming messages from both clusters would result in duplicate messages. So is there any way I can configure my Kafka consumer to receive messages from only one cluster? Thanks. Answer 1: "So is there any way I can configure my kafka consumer to receive messages from only one cluster?" Yes: a Kafka consumer instance will always receive messages
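The answer is cut off, but the point it begins to make is that a consumer instance only ever talks to the single cluster listed in its bootstrap.servers. A minimal sketch (cluster addresses, group id and topic are assumptions), in which failing over means constructing a new consumer against the other cluster:

import org.apache.kafka.clients.consumer.*;
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

public class SingleClusterConsumer {
    static KafkaConsumer<String, String> consumerFor(String bootstrapServers) {
        Properties props = new Properties();
        // A consumer only talks to the cluster reachable through these bootstrap servers.
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "failover-group"); // hypothetical group id
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                  "org.apache.kafka.common.serialization.StringDeserializer");
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                  "org.apache.kafka.common.serialization.StringDeserializer");
        return new KafkaConsumer<>(props);
    }

    public static void main(String[] args) {
        String clusterA = "a-broker1:9092"; // assumed address of cluster A
        String clusterB = "b-broker1:9092"; // assumed address of cluster B

        // Normally consume from A; switching to B requires closing this consumer
        // and creating a new one with cluster B's bootstrap servers.
        try (KafkaConsumer<String, String> consumer = consumerFor(clusterA)) {
            consumer.subscribe(Collections.singletonList("my-topic")); // hypothetical topic
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
            records.forEach(r -> System.out.println(r.value()));
        }
        // On failure of cluster A, the application could call consumerFor(clusterB) instead.
    }
}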