kafka-consumer-api

Kafka Consumer does not receive messages

浪子不回头ぞ submitted on 2019-12-19 04:23:32
Question: I am a newbie to Kafka. I read many instructions on the Internet for writing a Kafka producer and a Kafka consumer. I got the former working successfully and it can send messages to the Kafka cluster, but I have not managed to get the latter working. Please help me solve this problem. My problem looks like some posts on StackOverflow, but I want to describe it more clearly. I run Kafka and Zookeeper on an Ubuntu server in VirtualBox, using the simplest configuration (almost all defaults) with 1 Kafka cluster and
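In case it helps other newcomers, here is a minimal consumer sketch (not the poster's code), assuming a broker at localhost:9092 and a topic named "test", both hypothetical. The most common causes of receiving nothing are a missing subscribe() call, a group that has already committed offsets past the existing data, or auto.offset.reset left at its default of "latest":

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class MinimalConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");  // broker address (assumption)
        props.put("group.id", "test-group");                // a fresh group id
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("auto.offset.reset", "earliest");         // read from the beginning for a new group

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("test"));  // topic name is an assumption
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(1000L);
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("offset=%d key=%s value=%s%n",
                            record.offset(), record.key(), record.value());
                }
            }
        }
    }
}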

Kafka optimal retention and deletion policy

烂漫一生 submitted on 2019-12-18 12:34:15
Question: I am fairly new to Kafka, so forgive me if this question is trivial. I have a very simple setup for timing tests, as follows: Machine A -> writes to topic 1 (Broker) -> Machine B reads from topic 1; Machine B -> writes the message it just read to topic 2 (Broker) -> Machine A reads from topic 2. Now I am sending messages of roughly 1400 bytes in an infinite loop, filling up the space on my small broker very quickly. I'm experimenting with setting different values for log.retention.ms, log
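Not an answer to the timing question itself, but a sketch of how the retention knobs could be applied per topic from Java (the broker address and the topic name "topic1" are assumptions). One detail worth noting: retention is only enforced on closed log segments, so on a small broker it usually helps to lower segment.ms (or segment.bytes) along with retention.ms:

import java.util.Arrays;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.Config;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;

public class SetRetention {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");  // broker address (assumption)

        try (AdminClient admin = AdminClient.create(props)) {
            ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "topic1");  // hypothetical topic
            // Keep data for ~5 minutes and roll segments every minute so old data can actually be deleted.
            Config config = new Config(Arrays.asList(
                    new ConfigEntry("retention.ms", "300000"),
                    new ConfigEntry("segment.ms", "60000")));
            admin.alterConfigs(Collections.singletonMap(topic, config)).all().get();
        }
    }
}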

Difference between group.id and consumer.id in Kafka consumer

冷暖自知 submitted on 2019-12-18 11:46:40
Question: I am new to Kafka. I noticed that the consumer configuration has two IDs: one is group.id (mandatory) and the second is consumer.id (not mandatory). Please explain why there are two IDs and what the difference is. Answer 1: Consumer groups are a Kafka abstraction that supports both point-to-point and publish/subscribe messaging. A consumer can join a consumer group (let us say group_1) by setting its group.id to group_1. Consumer groups are also a way of supporting parallel consumption of the data, i.e.
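As a small illustration of the difference (not from the answer above): two consumer instances started with the same group.id split the topic's partitions between them, while a per-instance id only labels each instance for logs, metrics and quotas. Note that consumer.id belongs to the old consumer; in the newer Java consumer the per-instance label is client.id. The broker address and the topic name "orders" are assumptions:

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class GroupedConsumer {
    static KafkaConsumer<String, String> start(String clientId) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "group_1");   // same group.id => the two instances share the partitions
        props.put("client.id", clientId);   // per-instance label, only used for logging/metrics/quotas
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        consumer.subscribe(Collections.singletonList("orders"));
        return consumer;
    }

    public static void main(String[] args) {
        KafkaConsumer<String, String> a = start("instance-a");
        KafkaConsumer<String, String> b = start("instance-b");
        // Each instance's poll loop would now receive a disjoint subset of the topic's partitions.
    }
}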

Kafka only subscribe to latest message?

可紊 submitted on 2019-12-18 09:53:40
Question: Sometimes (it seems very random) Kafka sends old messages. I only want the latest messages, so that messages with the same key are overwritten. Currently it looks like I have multiple messages with the same key and they do not get compacted. I use this setting on the topic: cleanup.policy=compact. I'm using Java/Kotlin and the Apache Kafka 1.1.1 client. Properties(8).apply { val jaasTemplate = "org.apache.kafka.common.security.scram.ScramLoginModule required username=\"%s\" password=\"%s\";" val jaasCfg = String
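If the goal is just "start from the newest data on startup" rather than relying on compaction, one hedged option is to seek to the end of the assigned partitions before the poll loop. Compaction itself never touches the active segment, so recent duplicates of the same key are expected even with cleanup.policy=compact. The broker address and the topic name "state-topic" are assumptions:

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class LatestOnlyConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "latest-only");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("state-topic"));
            consumer.poll(0L);                           // trigger the group join so partitions get assigned
            consumer.seekToEnd(consumer.assignment());   // skip everything already stored in the topic
            while (true) {
                consumer.poll(1000L).forEach(record ->
                        System.out.println(record.key() + " -> " + record.value()));
            }
        }
    }
}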

Kafka 0.10.1 heartbeat.interval.ms, session.timeout.ms and max.poll.interval.ms

混江龙づ霸主 submitted on 2019-12-17 17:33:50
Question: I am using Kafka 0.10.1.1 and am confused by the following 3 properties: heartbeat.interval.ms, session.timeout.ms, max.poll.interval.ms. heartbeat.interval.ms - this was added in 0.10.1 and it sends heartbeats between polls. session.timeout.ms - this starts rebalancing if no request reaches Kafka, and it gets reset on every poll. max.poll.interval.ms - this applies across polls. But when does Kafka start rebalancing? Why do we need these 3? What are the default values for all of them? Thanks
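For reference, a sketch of the three settings with the defaults used by the 0.10.1.x Java consumer (double-check against the consumer configuration docs for other versions). The usual guidance is to keep heartbeat.interval.ms well below session.timeout.ms; a missed session timeout, or a gap between poll() calls longer than max.poll.interval.ms, is what triggers a rebalance:

import java.util.Properties;

public class ConsumerTimeouts {
    static Properties timeouts() {
        Properties props = new Properties();
        props.put("heartbeat.interval.ms", "3000");   // default 3 s: how often the background thread heartbeats
        props.put("session.timeout.ms", "10000");     // default 10 s: no heartbeat for this long => consumer considered dead, rebalance
        props.put("max.poll.interval.ms", "300000");  // default 5 min: max allowed gap between poll() calls => rebalance
        return props;
    }
}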

How to make a restart-able producer?

烈酒焚心 submitted on 2019-12-17 16:55:13
Question: The latest version of Kafka supports exactly-once semantics (EoS). To support this, extra details are added to each message. This means that at your consumer, if you print the offsets of the messages, they won't necessarily be sequential. This makes it harder to poll a topic to read the last committed message. In my case, the consumer printed something like this: Offset-0 0, Offset-2 1, Offset-4 2. Problem: In order to write a restart-able producer, I poll the topic and read the content of the last message. In this
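One hedged way around the non-contiguous offsets is to stop assuming the last record sits at endOffset - 1: with EoS, transaction markers also occupy offsets, so seek a little before the end and keep the last data record actually returned. The broker address, topic and partition here are placeholders, not the poster's setup:

import java.util.Collections;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class LastMessageReader {
    static ConsumerRecord<String, String> lastRecord(String topic, int partition) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "last-message-reader");
        props.put("isolation.level", "read_committed");  // only see committed transactional data
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            TopicPartition tp = new TopicPartition(topic, partition);
            consumer.assign(Collections.singletonList(tp));
            Map<TopicPartition, Long> end = consumer.endOffsets(Collections.singletonList(tp));
            long seekFrom = Math.max(0L, end.get(tp) - 5);  // small window, since markers may occupy offsets
            consumer.seek(tp, seekFrom);
            ConsumerRecord<String, String> last = null;
            for (ConsumerRecord<String, String> record : consumer.poll(2000L).records(tp)) {
                last = record;  // keep the final data record that was actually returned
            }
            return last;
        }
    }
}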

Kafka Avro Consumer with Decoder issues

送分小仙女□ submitted on 2019-12-17 16:38:06
Question: When I attempt to run a Kafka consumer with Avro over the data with my respective schema, it returns the error "AvroRuntimeException: Malformed data. Length is negative: -40". I see others have had similar issues converting a byte array to JSON, with Avro write and read, and with the Kafka Avro Binary *coder. I have also referenced this Consumer Group Example, which have all been helpful, however none have helped with this error thus far. It works up until this part of the code (line 73): Decoder decoder =
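For what it's worth, "Malformed data. Length is negative" usually means the decoder is not starting at the beginning of the Avro payload. In particular, if the producer used Confluent's KafkaAvroSerializer, each value carries a 5-byte prefix (1 magic byte + a 4-byte schema id), so either skip those bytes or use the matching KafkaAvroDeserializer. Below is a hedged sketch of plain Avro decoding, with the schema and the wire-format flag as placeholders rather than the poster's actual setup:

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryDecoder;
import org.apache.avro.io.DecoderFactory;

public class AvroValueDecoder {
    private final Schema schema;

    public AvroValueDecoder(Schema schema) {
        this.schema = schema;
    }

    public GenericRecord decode(byte[] value, boolean confluentWireFormat) throws Exception {
        int offset = confluentWireFormat ? 5 : 0;  // skip magic byte + schema id if the Confluent serializer was used
        BinaryDecoder decoder = DecoderFactory.get()
                .binaryDecoder(value, offset, value.length - offset, null);
        return new GenericDatumReader<GenericRecord>(schema).read(null, decoder);
    }
}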

How to create a custom serializer in Kafka?

半城伤御伤魂 submitted on 2019-12-17 16:29:07
Question: There are only a few serializers available, like org.apache.kafka.common.serialization.StringSerializer. How can we create our own custom serializer? Answer 1: Here is an example of using your own serializer/deserializer for the Kafka message value. For the Kafka message key it is the same thing. We want to send a serialized version of MyMessage as the Kafka value and deserialize it back into a MyMessage object on the consumer side. Serializing MyMessage in
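A sketch of one way to do it (not necessarily the answer's approach): implement org.apache.kafka.common.serialization.Serializer and turn the object into bytes however you like. JSON via Jackson is used here, and MyMessage is assumed to be a plain POJO:

import java.util.Map;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.apache.kafka.common.serialization.Serializer;

public class MyMessageSerializer implements Serializer<MyMessage> {
    private final ObjectMapper mapper = new ObjectMapper();

    @Override
    public void configure(Map<String, ?> configs, boolean isKey) {
        // no configuration needed for this sketch
    }

    @Override
    public byte[] serialize(String topic, MyMessage data) {
        try {
            return data == null ? null : mapper.writeValueAsBytes(data);
        } catch (Exception e) {
            throw new RuntimeException("Failed to serialize MyMessage", e);
        }
    }

    @Override
    public void close() {
        // nothing to close
    }
}

The producer would then be configured with value.serializer set to this class, and a matching Deserializer implemented the same way for the consumer side.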

Kafka: Consumer API vs Streams API

北城余情 submitted on 2019-12-17 15:07:10
Question: I recently started learning Kafka and ended up with these questions. What is the difference between Consumer and Streams? To me, any tool/application that consumes messages from Kafka is a consumer in the Kafka world. How are Streams different, given that they also consume messages from or produce messages to Kafka? And why are they needed, since we can write our own consumer application using the Consumer API and process the messages as needed, or send them to Spark from the consumer application? I Googled this, but did not get
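A small sketch of what the Streams API adds on top of a hand-written consumer loop: the same read-transform-write pipeline is declared as a topology, and Kafka Streams takes care of the consumer/producer plumbing, partition assignment, and any local state. The topic names "input-topic" and "output-topic" are made up for illustration:

import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

public class UppercaseStream {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "uppercase-app");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> input = builder.stream("input-topic");
        input.mapValues(value -> value.toUpperCase()).to("output-topic");

        new KafkaStreams(builder.build(), props).start();
    }
}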

Limit Kafka batches size when using Spark Streaming

荒凉一梦 submitted on 2019-12-17 10:46:34
Question: Is it possible to limit the size of the batches returned by the Kafka consumer for Spark Streaming? I am asking because the first batch I get has hundreds of millions of records and it takes ages to process and checkpoint them. Answer 1: I think your problem can be solved by Spark Streaming backpressure. Check spark.streaming.backpressure.enabled and spark.streaming.backpressure.initialRate. By default spark.streaming.backpressure.initialRate is not set and spark.streaming.backpressure.enabled is
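A hedged sketch of the configuration the answer points at, written as a SparkConf in Java; the per-partition hard cap spark.streaming.kafka.maxRatePerPartition is my addition (it is commonly combined with backpressure but is not mentioned in the answer above), and all values are illustrative:

import org.apache.spark.SparkConf;

public class StreamingRateLimits {
    static SparkConf rateLimited() {
        return new SparkConf()
                .setAppName("kafka-rate-limited")
                .set("spark.streaming.backpressure.enabled", "true")        // let Spark adapt the ingest rate to processing speed
                .set("spark.streaming.backpressure.initialRate", "10000")   // rate used for the first batch, before feedback kicks in
                .set("spark.streaming.kafka.maxRatePerPartition", "20000"); // hard cap in records/sec per Kafka partition (assumption)
    }
}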