kafka-consumer-api

Confluent Kafka Consumer Configuration - How are session.timeout.ms and max.poll.interval.ms related?

痴心易碎 submitted on 2019-12-11 05:25:16
Question: I'm trying to understand how the default values of the three Confluent consumer configurations below work together:
max.poll.interval.ms - per the Confluent documentation, the default value is 300,000 ms
session.timeout.ms - per the Confluent documentation, the default value is 10,000 ms
heartbeat.interval.ms - per the Confluent documentation, the default value is 3,000 ms
Let's say I'm using these default values in my configuration. Now I have a question here. For example, let's assume for a …
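These three settings govern two separate liveness checks: a background heartbeat thread (session.timeout.ms / heartbeat.interval.ms) and the time allowed between successive poll() calls (max.poll.interval.ms). The sketch below spells out the defaults in a plain Java consumer; the broker address, group id, and topic name are assumptions for illustration.

    import java.time.Duration;
    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    public class DefaultTimeoutsDemo {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092"); // assumed local broker
            props.put("group.id", "demo-group");              // hypothetical group
            props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            // The defaults, spelled out explicitly:
            props.put("session.timeout.ms", "10000");    // broker evicts the consumer if no heartbeat arrives within 10 s
            props.put("heartbeat.interval.ms", "3000");  // heartbeat thread pings the group coordinator every 3 s
            props.put("max.poll.interval.ms", "300000"); // poll() must be called at least every 5 min, or the group rebalances

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(Collections.singletonList("my-topic")); // hypothetical topic
                while (true) {
                    // Heartbeats cover process liveness; max.poll.interval.ms bounds
                    // the processing time between polls.
                    ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(1000));
                    records.forEach(r -> System.out.println(r.value()));
                }
            }
        }
    }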

Kafka - Multiple Consumers From Same Group Assigned Same Partition

↘锁芯ラ submitted on 2019-12-11 04:44:02
Question: I have posted this over at the Kafka User mailing list but didn't get any responses, so I figured I would try here as well. I am currently attempting to upgrade my software from Kafka 0.8.2 to 0.9. I am trying to switch over to the new consumer API to allow for rebalancing as machines are added to or removed from our cluster. I am running into an issue where the same partition on a topic is assigned to multiple consumers for a short period of time when a machine is added to the group.
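The new consumer API reports assignment changes through a ConsumerRebalanceListener; logging both callbacks is a quick way to see whether the overlap is a real double assignment or just records still being processed after revocation. A minimal sketch, assuming a plain Java consumer against a hypothetical topic and group:

    import java.util.Collection;
    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRebalanceListener;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.TopicPartition;

    public class RebalanceLoggingConsumer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092"); // assumption
            props.put("group.id", "my-group");                // assumption
            props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

            KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
            consumer.subscribe(Collections.singletonList("my-topic"), new ConsumerRebalanceListener() {
                @Override
                public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
                    // Invoked before partitions are taken away; committing offsets here
                    // keeps the next owner from reprocessing records already handled.
                    System.out.println("Revoked: " + partitions);
                }
                @Override
                public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
                    System.out.println("Assigned: " + partitions);
                }
            });
            while (true) {
                consumer.poll(1000).forEach(r -> System.out.println(r.value()));
            }
        }
    }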

How do I transform/fork a Kafka stream and send it over to a specific topic?

痴心易碎 submitted on 2019-12-11 04:18:59
Question: I am trying to transform the string value obtained in my original stream "textlines" into JSONObject messages using the function "mapValues", producing newStream. Then I stream whatever I get in newStream onto a topic called "testoutput". But every time a message actually goes through the transformation block, I get a NullPointerException with a stack trace pointing only into the Kafka Streams libraries. I have no idea what is going on :(( P.S. When I fork/create a new Kafka stream from the original stream, does the …
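One common cause of an NPE inside the Streams library here is the serde: mapValues changes the value type, so writing the forked stream with .to() needs a serde matching the new type (via Produced.with), otherwise the default String serde is applied to a JSONObject. A minimal sketch that sidesteps this by keeping the value a String; the application id, broker address, and JSON field name are assumptions:

    import java.util.Properties;
    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.StreamsConfig;
    import org.apache.kafka.streams.kstream.KStream;
    import org.apache.kafka.streams.kstream.Produced;
    import org.json.JSONObject;

    public class ForkToTopic {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "fork-demo");         // assumption
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumption
            props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
            props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

            StreamsBuilder builder = new StreamsBuilder();
            KStream<String, String> textLines = builder.stream("textlines");

            // Build the JSON but serialize it back to a String, so the String serde
            // still applies when the forked stream is written out.
            KStream<String, String> newStream = textLines.mapValues(
                    value -> new JSONObject().put("payload", value).toString());

            // Forking does not consume the original: textLines can still feed other branches.
            newStream.to("testoutput", Produced.with(Serdes.String(), Serdes.String()));

            new KafkaStreams(builder.build(), props).start();
        }
    }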

Control enabling/disabling Kafka consumers in Spring Boot

我是研究僧i submitted on 2019-12-11 03:04:47
Question: I have configured several Kafka consumers in Spring Boot. This is what kafka.properties looks like (only listing the config for one consumer here):
kafka.topics=
bootstrap.servers=
group.id=
enable.auto.commit=
auto.commit.interval.ms=
session.timeout.ms=
schema.registry.url=
auto.offset.reset=
kafka.enabled=
Here is the config:
@Configuration
@PropertySource({"classpath:kafka.properties"})
public class KafkaConsumerConfig {
    @Autowired
    private Environment env;
    @Bean
    public ConsumerFactory …
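With spring-kafka, a kafka.enabled flag like the one above can drive the listener container directly: the autoStartup attribute of @KafkaListener accepts a property placeholder, and the KafkaListenerEndpointRegistry can start or stop the same container at runtime. A sketch under those assumptions (the listener id and handler method are hypothetical, and autoStartup as a placeholder string requires a reasonably recent spring-kafka):

    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.kafka.annotation.KafkaListener;
    import org.springframework.kafka.config.KafkaListenerEndpointRegistry;
    import org.springframework.kafka.listener.MessageListenerContainer;
    import org.springframework.stereotype.Component;

    @Component
    public class ToggleableConsumer {

        @Autowired
        private KafkaListenerEndpointRegistry registry;

        // autoStartup reads the kafka.enabled flag from kafka.properties, so the
        // listener container only starts when the flag is true.
        @KafkaListener(id = "myListener", topics = "${kafka.topics}",
                       autoStartup = "${kafka.enabled}")
        public void listen(String message) {
            System.out.println("Received: " + message);
        }

        // Start or stop the same container at runtime, e.g. from an admin endpoint.
        public void setRunning(boolean running) {
            MessageListenerContainer container = registry.getListenerContainer("myListener");
            if (running) container.start(); else container.stop();
        }
    }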

Apache Kafka: …StringDeserializer is not an instance of …Deserializer

て烟熏妆下的殇ゞ submitted on 2019-12-11 02:22:38
Question: In my simple application I am trying to instantiate a KafkaConsumer; my code is nearly a copy of the code from the javadoc ("Automatic Offset Committing"):
@Slf4j
public class MyKafkaConsumer {
    public MyKafkaConsumer() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "test");
        props.put("enable.auto.commit", "true");
        props.put("auto.commit.interval.ms", "1000");
        props.put("key.deserializer", "org.apache.kafka.common.serialization…
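This error typically means the deserializer class was loaded by a different classloader than the kafka-clients jar (common in servlet containers and OSGi), so the instanceof check inside the client fails. Two workarounds are often suggested: pass the Class object instead of the class-name string, or temporarily clear the context classloader while constructing the consumer. A sketch combining both; whether it matches your deployment is an assumption:

    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.serialization.StringDeserializer;

    public class ClassLoaderSafeConsumer {
        public static KafkaConsumer<String, String> create() {
            Properties props = new Properties();
            props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            props.put(ConsumerConfig.GROUP_ID_CONFIG, "test");
            // Class objects are resolved by this class's own classloader rather than
            // by Kafka's reflective lookup of a class-name string.
            props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
            props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);

            // Clearing the context classloader makes kafka-clients fall back to its
            // own loader, avoiding the mismatch; restore it afterwards.
            ClassLoader original = Thread.currentThread().getContextClassLoader();
            Thread.currentThread().setContextClassLoader(null);
            try {
                return new KafkaConsumer<>(props);
            } finally {
                Thread.currentThread().setContextClassLoader(original);
            }
        }
    }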

Trigger a spark job whenever an event occurs

元气小坏坏 submitted on 2019-12-11 02:12:21
Question: I have a Spark application which should run whenever it receives a Kafka message on a topic. I won't be receiving more than 5-6 messages a day, so I don't want to take the Spark Streaming approach. Instead I tried to submit the application using SparkLauncher, but I don't like that approach because I have to set the Spark and Java classpaths programmatically within my code, along with all the necessary Spark properties like executor cores, executor memory, etc. How do I trigger the Spark application to run from …
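One way to keep the classpath and Spark properties out of the triggering code is to shell out to spark-submit, so everything lives in the usual launch script. A sketch of a long-running consumer that does this; the topic, job class, and jar path are hypothetical:

    import java.time.Duration;
    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    public class SparkTrigger {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092"); // assumption
            props.put("group.id", "spark-trigger");
            props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(Collections.singletonList("trigger-topic")); // hypothetical topic
                while (true) {
                    ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(5));
                    for (ConsumerRecord<String, String> record : records) {
                        // spark-submit carries the classpath and executor settings itself;
                        // the message value is passed to the job as an argument.
                        new ProcessBuilder("spark-submit",
                                "--class", "com.example.MyJob",  // hypothetical class
                                "/path/to/job.jar",              // hypothetical jar
                                record.value())
                            .inheritIO()
                            .start()
                            .waitFor();
                    }
                }
            }
        }
    }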

Syntax error in CQL query when trying to write to Cassandra from Python

喜你入骨 submitted on 2019-12-11 00:35:53
Question: So, I am building an application in Python that takes data from Twitter and then saves it to Cassandra. My current problem lies in a script which reads data from Kafka and tries to write it to Cassandra, as follows:
import threading, logging, time
import multiprocessing
from cassandra.cluster import Cluster
from kafka import KafkaConsumer, KafkaProducer

class Consumer(multiprocessing.Process):
    def __init__(self):
        multiprocessing.Process.__init__(self)
        self.stop_event = multiprocessing.Event(…
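CQL syntax errors when inserting tweet text usually come from building the statement by string concatenation: quotes and newlines in the tweet break the query. The standard fix is a prepared statement with bound parameters, shown here as a Java sketch with the DataStax driver (the keyspace and table are hypothetical); the Python cassandra-driver offers the same pattern via session.prepare() and parameter binding:

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.PreparedStatement;
    import com.datastax.driver.core.Session;

    public class SafeCqlInsert {
        public static void main(String[] args) {
            try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
                 Session session = cluster.connect("twitter_ks")) { // hypothetical keyspace
                // Binding values instead of concatenating them into the CQL string avoids
                // syntax errors from quotes/newlines in tweet text (and CQL injection).
                PreparedStatement ps = session.prepare(
                        "INSERT INTO tweets (id, body) VALUES (?, ?)"); // hypothetical table
                session.execute(ps.bind("42", "it's a \"tricky\" tweet\nwith newlines"));
            }
        }
    }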

Kafka offset management

∥☆過路亽.° submitted on 2019-12-11 00:15:39
Question: We are using Kafka 0.10... I'm seeing some conflicting information online (and in the documentation) regarding how offsets are managed in Kafka when enable.auto.commit is TRUE. Does the same poll() method that retrieves messages also handle the commits at the configured intervals? If I retrieve messages from poll() in a single-threaded application and process the messages to completion (including handling errors) in the SAME thread, meaning poll() will not be invoked again until after my processing is …
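To the first part: yes, auto-commit is driven from inside poll() (and on close()). Each poll() checks whether auto.commit.interval.ms has elapsed and, if so, commits the offsets returned by the previous poll. In a single-threaded loop that finishes processing before calling poll() again, the commit therefore only ever covers records already handed to the application. A minimal sketch; the broker address and topic are assumptions:

    import java.time.Duration;
    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    public class AutoCommitLoop {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092"); // assumption
            props.put("group.id", "offset-demo");
            props.put("enable.auto.commit", "true");
            props.put("auto.commit.interval.ms", "5000");
            props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(Collections.singletonList("my-topic")); // hypothetical
                while (true) {
                    ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                    for (ConsumerRecord<String, String> record : records) {
                        // The next poll() commits this batch's offsets once the interval
                        // has elapsed; a crash mid-batch replays uncommitted records
                        // (at-least-once) rather than skipping them.
                        System.out.println(record.value());
                    }
                }
            }
        }
    }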

Consuming from a replica

旧街凉风 submitted on 2019-12-10 21:21:11
Question: Kafka replicates each partition of a topic up to the specified replication factor. As far as I know, all write and read requests are routed to the leader of the partition. Is there any way to consume from the followers rather than from the leader? Is replication in Kafka only for fail-over?
Answer 1: In Kafka 2.3 and older, you can only consume from the leader -- this is by design. Replication is for fault-tolerance only. If the leader fails, the followers will elect a new leader. Have a look at this blog …
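Since the answer is scoped to 2.3 and older: Kafka 2.4 added "fetch from followers" (KIP-392), where a consumer that declares its rack may be served by an in-sync follower in the same rack. A sketch of the consumer side, with a hypothetical zone name; the matching broker-side settings are shown as comments:

    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    public class FollowerFetchConsumer {
        public static void main(String[] args) {
            // Broker side (server.properties), per KIP-392:
            //   broker.rack=us-east-1a
            //   replica.selector.class=org.apache.kafka.common.replica.RackAwareReplicaSelector
            Properties props = new Properties();
            props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumption
            props.put(ConsumerConfig.GROUP_ID_CONFIG, "rack-aware-group");
            props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                    "org.apache.kafka.common.serialization.StringDeserializer");
            props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                    "org.apache.kafka.common.serialization.StringDeserializer");
            // The consumer declares its rack; the broker may then serve fetches from
            // an in-sync follower in the same rack instead of the leader.
            props.put(ConsumerConfig.CLIENT_RACK_CONFIG, "us-east-1a"); // hypothetical zone
            KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
            // ... subscribe and poll as usual
        }
    }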

org.apache.kafka.common.KafkaException: Failed to construct kafka consumer

守給你的承諾、 submitted on 2019-12-10 14:41:06
Question: I am manually starting Zookeeper, then the Kafka server, and finally the Kafka REST server, each with its respective properties file. Next, I am deploying my Spring Boot application on Tomcat. In the Tomcat log trace, I am getting the error org.springframework.context.ApplicationContextException: Failed to start bean 'org.springframework.kafka.config.internalKafkaListenerEndpointRegistry'; nested exception is org.apache.kafka.common.KafkaException: Failed to construct kafka consumer, and my application …
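"Failed to construct kafka consumer" is a wrapper; the nested cause is usually a missing or blank config value, an unresolvable (de)serializer class, or a bad bootstrap.servers entry, so the first step is reading the full nested stack trace. For comparison, a minimal, fully specified consumer factory that leaves none of the usual gaps (broker address and group id are assumptions):

    import java.util.HashMap;
    import java.util.Map;
    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.common.serialization.StringDeserializer;
    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;
    import org.springframework.kafka.annotation.EnableKafka;
    import org.springframework.kafka.config.ConcurrentKafkaListenerContainerFactory;
    import org.springframework.kafka.core.ConsumerFactory;
    import org.springframework.kafka.core.DefaultKafkaConsumerFactory;

    @EnableKafka
    @Configuration
    public class MinimalKafkaConfig {

        @Bean
        public ConsumerFactory<String, String> consumerFactory() {
            Map<String, Object> props = new HashMap<>();
            // Every key the consumer needs is set explicitly; a blank or missing value
            // here is a typical root cause behind "Failed to construct kafka consumer".
            props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumption
            props.put(ConsumerConfig.GROUP_ID_CONFIG, "my-group");                // assumption
            props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
            props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
            return new DefaultKafkaConsumerFactory<>(props);
        }

        @Bean
        public ConcurrentKafkaListenerContainerFactory<String, String> kafkaListenerContainerFactory() {
            ConcurrentKafkaListenerContainerFactory<String, String> factory =
                    new ConcurrentKafkaListenerContainerFactory<>();
            factory.setConsumerFactory(consumerFactory());
            return factory;
        }
    }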