kafka-consumer-api

How to use "li-apache-kafka-clients" in a Spring Boot app to send large messages (above 1 MB) from a Kafka producer?

落花浮王杯 submitted on 2020-05-17 08:06:07
Question: How do I use li-apache-kafka-clients in a Spring Boot app to send large messages (above 1 MB) from a Kafka producer to a Kafka consumer? Here is the GitHub link for li-apache-kafka-clients: https://github.com/linkedin/li-apache-kafka-clients I have imported the li-apache-kafka-clients .jar file and set the following producer configuration:

    props.put("large.message.enabled", "true");
    props.put("max.message.segment.bytes", 1000 * 1024);
    props.put("segment.serializer", DefaultSegmentSerializer.class);
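For context, a minimal sketch of how such a producer might be wired up, using the three configuration keys quoted in the question. The broker address, topic name, payload, and the import paths / LiKafkaProducerImpl constructor are assumptions based on the project's README, not verified against a running setup:

    import java.util.Properties;
    import java.util.UUID;

    import com.linkedin.kafka.clients.largemessage.DefaultSegmentSerializer;
    import com.linkedin.kafka.clients.producer.LiKafkaProducer;
    import com.linkedin.kafka.clients.producer.LiKafkaProducerImpl;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class LargeMessageProducerSketch {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092"); // assumed broker address
            props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            // Settings from the question: payloads above max.message.segment.bytes
            // are split into segments and reassembled on the consumer side.
            props.put("large.message.enabled", "true");
            props.put("max.message.segment.bytes", 1000 * 1024);
            props.put("segment.serializer", DefaultSegmentSerializer.class);

            String largePayload = "x".repeat(2 * 1024 * 1024); // ~2 MB test message

            try (LiKafkaProducer<String, String> producer = new LiKafkaProducerImpl<>(props)) {
                producer.send(new ProducerRecord<>("large-topic", // assumed topic
                        UUID.randomUUID().toString(), largePayload));
            }
        }
    }

Note that the consuming side must use the library's LiKafkaConsumer as well: a plain KafkaConsumer would see the individual segments rather than the reassembled message.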

Kafka Consumer group rebalancing

六眼飞鱼酱① submitted on 2020-04-18 05:42:43
Question: I'm using Kafka consumer group management to process my messages. The processing times of my messages vary from one another, so I have set max.poll.interval.ms to 20 minutes and max.poll.records to 20. I'm using 5 partitions and 5 consumer instances, with default config values apart from those two. But I still get the following error intermittently:

    [Consumer clientId=consumer-3, groupId=amc_dashboard_analytics] Attempt to heartbeat failed since group is rebalancing

The understanding…
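A minimal sketch of the consumer setup the question describes, with the two non-default values spelled out. The broker address, topic name, and deserializer choices are assumptions:

    import java.time.Duration;
    import java.util.List;
    import java.util.Properties;

    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    public class SlowProcessingConsumer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092"); // assumed
            props.put("group.id", "amc_dashboard_analytics");
            props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            // The two non-default values from the question:
            props.put("max.poll.interval.ms", 20 * 60 * 1000); // 20 min allowed between poll() calls
            props.put("max.poll.records", 20);                 // at most 20 records per poll

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(List.of("dashboard-events")); // assumed topic name
                while (true) {
                    ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                    for (ConsumerRecord<String, String> record : records) {
                        // Must average under 1 min/record to stay inside the 20 min budget.
                        process(record);
                    }
                }
            }
        }

        static void process(ConsumerRecord<String, String> record) {
            // application-specific work goes here
        }
    }

The quoted error is what a member logs while a rebalance is already underway, for example because some member exceeded the poll-interval budget or because a member joined or left the group.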

Not able to poll / fetch all records from Kafka topic

穿精又带淫゛_ submitted on 2020-04-17 20:28:09
Question: I am trying to poll data from a specific topic. Kafka is receiving about 100 records/s, but most of the time the poll does not fetch all records. I am using a timeout of 5000 ms and I am calling this method every 100 ms. Note: I am subscribing to the specific topic too.

    @Scheduled(fixedDelayString = "100")
    public void pollRecords() {
        ConsumerRecords<String, String> records = leadConsumer.poll("5000");

How can I fetch all the data from Kafka?

Answer 1: The maximum number of records returned from poll() is specified by the max.poll.records configuration parameter (500 by default).
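For illustration, a sketch of the usual alternative: instead of a scheduled method, run one dedicated poll loop and set max.poll.records explicitly. Names and values here are assumptions, not from the original thread:

    import java.time.Duration;
    import java.util.List;
    import java.util.Properties;

    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    public class DedicatedPollLoop {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092"); // assumed
            props.put("group.id", "lead-consumer-group");     // assumed
            props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("max.poll.records", 500); // per-poll cap; 500 is already the default

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(List.of("lead-topic"));    // assumed topic name
                while (true) {
                    // poll() takes a Duration (or a long in older clients), not a String.
                    ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(5000));
                    records.forEach(r ->
                            System.out.printf("offset=%d value=%s%n", r.offset(), r.value()));
                }
            }
        }
    }

A single long-lived loop avoids overlapping scheduled invocations, and each poll() returns as soon as max.poll.records is reached or the timeout expires, so no records are skipped; they simply arrive over successive polls.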

If I have a transactional producer in Kafka, can I read exactly-once messages with Kafka Streams?

走远了吗. submitted on 2020-03-26 04:28:29
Question: I would like to have exactly-once semantics, but I don't want to read messages with a plain Consumer; I'd rather read them with the Kafka Streams API. If I add processing.guarantee=exactly_once to the Streams configuration, will exactly-once semantics be kept?

Answer 1: Exactly-once processing is based on a read-process-write pattern. Kafka Streams uses this pattern, and thus, if you write a regular Kafka Streams application that writes the result back to a Kafka topic, you will get exactly-once processing.
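A minimal sketch of a Streams app with that guarantee enabled; the application id, broker address, topics, and the mapValues step are placeholders:

    import java.util.Properties;

    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.StreamsConfig;

    public class ExactlyOnceStreamsSketch {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "eos-app");           // placeholder
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
            props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
            props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());
            // Enables transactional read-process-write across input offsets,
            // state-store changelogs, and the output topic.
            props.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG, StreamsConfig.EXACTLY_ONCE);

            StreamsBuilder builder = new StreamsBuilder();
            builder.<String, String>stream("input-topic")   // placeholder topic
                   .mapValues(v -> v.toUpperCase())         // stand-in for real processing
                   .to("output-topic");                     // placeholder topic

            new KafkaStreams(builder.build(), props).start();
        }
    }

The transaction covers the consumed offsets, any state-store changelogs, and the produced output together, so a downstream consumer configured with isolation.level=read_committed sees each result exactly once.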

Difference between session.timeout.ms and max.poll.interval.ms for Kafka

旧巷老猫 submitted on 2020-03-23 15:39:30
Question: AFAIK, max.poll.interval.ms was introduced in Kafka 0.10.1. However, it is still unclear to me when we would use both session.timeout.ms and max.poll.interval.ms. Consider the case in which the heartbeat thread is not responding, but my processing thread is still processing the record because it has the higher value set. Once the heartbeat thread is down and session.timeout.ms has elapsed, what exactly happens? Because I've observed in a POC that the consumer rebalance doesn't happen until it reaches max.poll.interval.ms…
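To make the distinction concrete, a config sketch with comments stating which failure each setting detects; the values are illustrative, not recommendations:

    import java.util.Properties;

    public class ConsumerTimeoutContrast {
        static Properties consumerTimeouts() {
            Properties props = new Properties();
            // session.timeout.ms: liveness of the *background heartbeat thread*.
            // If the group coordinator hears no heartbeat within this window
            // (process crash, hard JVM pause, network partition), it evicts
            // the member and starts a rebalance.
            props.put("session.timeout.ms", 10_000);
            props.put("heartbeat.interval.ms", 3_000); // heartbeats roughly every 3 s

            // max.poll.interval.ms: progress of the *application poll loop*.
            // If poll() is not called again within this window, the client
            // itself leaves the group, which also triggers a rebalance.
            props.put("max.poll.interval.ms", 300_000);
            return props;
        }
    }

Since KIP-62 (Kafka 0.10.1) heartbeats come from a separate background thread, so the two failures are detected independently: a dead process is caught by session.timeout.ms even mid-processing, whereas a live process that merely stops calling poll() keeps heartbeating and is only caught by max.poll.interval.ms, which matches the POC observation that the rebalance waits for the larger value.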

Join multiple Kafka topics by key

醉酒当歌 submitted on 2020-03-16 05:45:13
Question: How can I write a consumer that joins multiple Kafka topics by key in a scalable way? I have a topic that publishes events with a key, and a second topic that publishes other events, related to a subset of the first, with the same key. I would like to write a consumer that subscribes to both topics and performs some additional actions for the subset that appears in both topics. I can do this easily with a single consumer: read everything from both topics, maintain state locally, and perform the actions…
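One common way to express this kind of key-based join scalably is Kafka Streams. A sketch under stated assumptions: topic names, window size, and value types are placeholders, not from the question:

    import java.time.Duration;
    import java.util.Properties;

    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.StreamsConfig;
    import org.apache.kafka.streams.kstream.JoinWindows;
    import org.apache.kafka.streams.kstream.KStream;

    public class TopicJoinSketch {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "topic-join-app");    // placeholder
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
            props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
            props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

            StreamsBuilder builder = new StreamsBuilder();
            KStream<String, String> primary = builder.stream("events-a");   // placeholder
            KStream<String, String> secondary = builder.stream("events-b"); // placeholder

            // Inner join by key: only keys present in both topics (within the
            // window) produce output, matching "the subset in both topics".
            primary.join(secondary,
                         (a, b) -> a + "|" + b,                 // combine the two events
                         JoinWindows.of(Duration.ofMinutes(5))) // events must arrive within 5 min
                   .to("joined-actions");                       // placeholder output topic

            new KafkaStreams(builder.build(), props).start();
        }
    }

Scaling comes from partitioning: the two topics must be co-partitioned (same key type, same partition count) so each Streams instance joins only its share of the keys, with the local join state sharded the same way.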