kafka-consumer-api

Difference between heartbeat.interval.ms and session.timeout.ms in Kafka consumer config

北慕城南 submitted on 2020-02-01 01:17:04
Question: I'm currently running Kafka 0.10.0.1, and the corresponding docs for the two values in question are as follows: heartbeat.interval.ms: The expected time between heartbeats to the consumer coordinator when using Kafka's group management facilities. Heartbeats are used to ensure that the consumer's session stays active and to facilitate rebalancing when new consumers join or leave the group. The value must be set lower than session.timeout.ms, but typically should be set no higher than 1/3 of …
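As a concrete illustration of the constraint the docs describe, a consumer configuration might pair the two values like this (the numbers are hypothetical, chosen only to satisfy the 1/3 guideline):

```properties
# heartbeat.interval.ms must be lower than session.timeout.ms,
# and typically no more than 1/3 of it, so missing a couple of
# heartbeats does not immediately expire the session:
session.timeout.ms=30000
heartbeat.interval.ms=10000
```

The intuition: the coordinator declares a consumer dead only after session.timeout.ms without a heartbeat, so sending heartbeats roughly three times per session window gives the consumer several chances before it is kicked from the group.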

Kafka consumer exception and offset commits

谁说我不能喝 submitted on 2020-01-31 22:56:06
Question: I've been trying to do some POC work for Spring Kafka. Specifically, I wanted to experiment with best practices for dealing with errors while consuming messages from Kafka. I am wondering if anyone is able to help with: sharing best practices for what Kafka consumers should do when there is a failure; helping me understand how AckMode RECORD works, and how to prevent commits to the Kafka offset queue when an exception is thrown in the listener method. The code …
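One common pattern for this in Spring Kafka 2.x is sketched below (a configuration sketch, not the asker's code; the bean wiring and generic types are assumptions). With AckMode.RECORD the container commits the offset after each record's listener invocation returns successfully; if the listener throws, no commit happens, and an error handler such as SeekToCurrentErrorHandler re-seeks so the record is redelivered rather than skipped:

```java
// Sketch (Spring Kafka 2.x APIs; consumerFactory is assumed to be
// configured elsewhere in the application context):
@Bean
public ConcurrentKafkaListenerContainerFactory<String, String> kafkaListenerContainerFactory(
        ConsumerFactory<String, String> consumerFactory) {
    ConcurrentKafkaListenerContainerFactory<String, String> factory =
            new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConsumerFactory(consumerFactory);
    // Commit the offset after each successfully processed record:
    factory.getContainerProperties().setAckMode(ContainerProperties.AckMode.RECORD);
    // On exception, re-seek to the failed record so it is redelivered
    // instead of its offset being committed:
    factory.setErrorHandler(new SeekToCurrentErrorHandler());
    return factory;
}
```

In newer Spring Kafka versions (2.8+), DefaultErrorHandler supersedes SeekToCurrentErrorHandler with the same redelivery behavior plus configurable back-off and dead-letter publishing.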

Spark Structured Streaming with secured Kafka throwing: Not authorized to access group exception

扶醉桌前 submitted on 2020-01-24 07:43:24
Question: In order to use Structured Streaming in my project, I am testing Spark 2.2.0 and Kafka 0.10.1 integration with Kerberos on my Hortonworks 2.6.3 environment, running the sample code below to check the integration. I am able to run the program in IntelliJ in Spark local mode with no issues, but when the same program is moved to YARN cluster/client mode on the Hadoop cluster, it throws the exception below. I know I can configure a Kafka ACL for the group id, but Spark Structured Streaming generates a new …
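For context on why a fixed-group ACL does not help here: the Spark 2.2 Kafka source generates a fresh, random consumer group id (of the form spark-kafka-source-&lt;uuid&gt;...) for each query, so one workaround is to grant the job's principal read access to all groups. A sketch of the ACL command (principal name and ZooKeeper address are placeholders):

```shell
# Grant the Spark job's Kerberos principal access to every consumer group,
# since the auto-generated group id cannot be known in advance:
bin/kafka-acls.sh --authorizer-properties zookeeper.connect=zk-host:2181 \
  --add --allow-principal User:spark \
  --operation Read --group '*'
```

Note that prefixed ACL patterns only arrived in Kafka 2.0, and the ability to pin the group id from Spark (the kafka.group.id source option) only in Spark 3.0, so on this stack the wildcard-group ACL is the practical route.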

Can I run a Kafka Streams application on the same machine as a Kafka broker?

。_饼干妹妹 submitted on 2020-01-23 13:28:09
Question: I have a Kafka Streams application which takes data from a few topics, joins the data, and puts it in another topic. Kafka configuration: 5 Kafka brokers; Kafka topics with 15 partitions and replication factor 3. Note: I am running the Kafka Streams applications on the same machines where my Kafka brokers are running. A few million records are consumed/produced every hour. Whenever I take any Kafka broker down, the group goes into rebalancing, and it takes approx. 30 minutes or sometimes even more for …
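Long recovery after an instance or broker failure is often dominated by restoring the join's local state stores from changelog topics. One Streams-side mitigation is to keep warm replicas of that state, sketched below as a configuration fragment (the value is illustrative, not taken from the question):

```properties
# Keep one warm standby copy of each state store on another instance, so a
# failed task can move without replaying the whole changelog from the brokers:
num.standby.replicas=1
```

This trades extra disk and network usage on the co-located machines for much faster task migration; whether that trade-off is acceptable depends on how much headroom the shared broker/Streams hosts have.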

Kafka having duplicate messages

旧城冷巷雨未停 submitted on 2020-01-23 09:51:45
Question: I don't see any failures while producing or consuming the data, yet there are a bunch of duplicate messages in production. For a small topic which gets around 100k messages, there are ~4k duplicates, though, as I said, there is no failure, and on top of that no retry logic is implemented and no retry config value is set. I also checked the offset values for those duplicate messages, and each has a distinct value, which tells me that the issue is in the producer. Any help would be highly appreciated. Answer 1: Read …
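Distinct offsets do indeed point at the producer writing the record twice (for example, a send whose acknowledgment was lost and was retried at some layer). On Kafka 0.11 and later, the idempotent producer closes exactly this gap; a sketch of the relevant producer settings (not applicable to older brokers):

```properties
# Broker de-duplicates retried sends using a producer id and sequence numbers
# (requires Kafka 0.11+ on both client and broker):
enable.idempotence=true
acks=all
```

On older versions, the usual checks are application-level resends (e.g. the producing service itself retrying on timeout) and consumer-side de-duplication keyed on a business identifier.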

KafkaConsumer Java API subscribe() vs assign()

僤鯓⒐⒋嵵緔 submitted on 2020-01-22 18:58:07
Question: I am new to the Kafka Java API, and I am working on consuming records from a particular Kafka topic. I understand that I can use the method subscribe() to start polling records from the topic. Kafka also provides the method assign() if I want to start polling records from selected partitions of the topic. I want to understand whether this is the only difference between the two. Answer 1: Yes. subscribe() needs group.id because each consumer in a group can dynamically set the list of topics it wants to subscribe to …
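The contrast can be sketched with the kafka-clients API (topic name, partition number, and broker address below are placeholders). subscribe() enrolls the consumer in group management, so partitions are assigned dynamically and rebalanced as members join or leave; assign() takes a static partition list with no group membership, no rebalancing, and no need for group.id:

```java
// Sketch contrasting KafkaConsumer.subscribe() and KafkaConsumer.assign():
Properties props = new Properties();
props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
props.put(ConsumerConfig.GROUP_ID_CONFIG, "demo-group"); // needed for subscribe()
props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);

// Dynamic: join the group and let the coordinator hand out partitions.
consumer.subscribe(Collections.singletonList("some_topic"));

// Static alternative: take exactly these partitions, no group coordination.
// consumer.assign(Collections.singletonList(new TopicPartition("some_topic", 0)));
```

A practical consequence: two consumers that assign() the same partition will both receive every record, whereas under subscribe() the group coordinator ensures each partition is owned by only one member of the group at a time.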

Reading the same message several times from Kafka

给你一囗甜甜゛ submitted on 2020-01-21 11:55:06
Question: I use the Spring Kafka API to implement a Kafka consumer with manual offset management: @KafkaListener(topics = "some_topic") public void onMessage(@Payload Message message, Acknowledgment acknowledgment) { if (someCondition) { acknowledgment.acknowledge(); } } Here, I want the consumer to commit the offset only if someCondition holds. Otherwise, the consumer should sleep for some time and read the same message again. Kafka configuration: @Bean public ConcurrentKafkaListenerContainerFactory<String, …
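Merely not acknowledging does not re-read the message: the consumer's in-memory position has already advanced past it, so the next poll returns the following record. In Spring Kafka 2.3+, Acknowledgment.nack() expresses exactly the "sleep, then redeliver" intent, as in this sketch of the listener (the 5-second pause is an illustrative value):

```java
// Sketch (Spring Kafka 2.3+, MANUAL ack mode): nack() makes the container
// re-seek to this record so it is polled again after the given pause,
// instead of the listener thread sleeping on its own.
@KafkaListener(topics = "some_topic")
public void onMessage(@Payload Message message, Acknowledgment acknowledgment) {
    if (someCondition) {
        acknowledgment.acknowledge();   // commit the offset and move on
    } else {
        acknowledgment.nack(5000);      // redeliver this record after ~5s
    }
}
```

On versions before 2.3, the equivalent is done by hand: seek the consumer back to the record's offset (e.g. via ConsumerSeekAware) so the next poll fetches it again.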

Kafka delivering duplicate message

女生的网名这么多〃 submitted on 2020-01-15 03:58:08
Question: We are using Kafka (0.9.0.0) for orchestrating command messages between different microservices. We are seeing an intermittent issue where duplicate messages get delivered to a particular topic. The logs that occur when this issue happens are given below. Can someone help me understand this issue? Wed, 21-Sep-2016 09:19:07 - WARNING Coordinator unknown during heartbeat -- will retry Wed, 21-Sep-2016 09:19:07 - WARNING Heartbeat failed; retrying Wed, 21-Sep-2016 09:19:07 - WARNING …
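Those warnings are typical of the 0.9 consumer being evicted from its group: heartbeats are only sent from within poll(), so if processing a batch takes longer than session.timeout.ms, the coordinator assumes the consumer is dead, rebalances, and redelivers the uncommitted records to another member, which looks like duplicate delivery. A sketch of the usual mitigation on 0.9 (the value is illustrative and must stay within the broker's group.max.session.timeout.ms bound):

```properties
# Give slow message processing more headroom before the coordinator
# declares the consumer dead and triggers a rebalance:
session.timeout.ms=60000
```

Later clients (0.10.1+) decouple this properly via a background heartbeat thread and a separate max.poll.interval.ms, so upgrading the client is the more durable fix.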

Seeing “partition doesn't exist” warnings/failures after using the Kafka partition re-assignment tool

喜欢而已 submitted on 2020-01-13 10:15:10
Question: I am using Kafka 0.8.1.1. I have a 3-node Kafka cluster with some topics having around 5 partitions. I planned to increase the number of nodes in the cluster to 5 and move some partitions from the existing topics to the new brokers. Previous partition state: broker1: topic1 {partition 0}; broker2: topic1 {partitions 1, 2}; broker3: topic1 {partitions 3, 4}. New intended state: broker1: topic1 {partition 0}; broker2: topic1 {partition 1}; broker3: topic1 {partition 3}; broker4: topic1 { …
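For context, a move like the one described is driven by a JSON plan fed to the reassignment tool. A sketch consistent with the intended layout above (broker ids 4 and 5 stand for the hypothetical new nodes; replica lists here assume replication factor 1 for brevity):

```json
{
  "version": 1,
  "partitions": [
    { "topic": "topic1", "partition": 2, "replicas": [4] },
    { "topic": "topic1", "partition": 4, "replicas": [5] }
  ]
}
```

The plan is applied with something like `bin/kafka-reassign-partitions.sh --zookeeper zk-host:2181 --reassignment-json-file reassign.json --execute`, then checked with `--verify`; "partition doesn't exist" warnings typically mean a consumer or broker still holds stale metadata from before the move, or the JSON referenced a topic/partition that was never created.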