apache-kafka

How does max.poll.records affect the consumer poll?

Submitted by ぃ、小莉子 on 2021-02-20 18:50:29
Question: max.poll.records was recently changed to 500 in our Kafka consumer config, and I am wondering how this affects the consumer poll. Is it just an upper bound on the number of records that can be fetched, or does the consumer wait until it gets 500 records?

Answer 1: max.poll.records: yes, in the new consumer this property defaults to 500, which means a single poll can return anywhere from 1 to at most 500 records, and the consumer will not wait when a partition does not have…
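In other words, max.poll.records is only a cap: poll() returns as soon as any records are available (or its timeout expires) and never blocks waiting for the cap to fill. A minimal sketch illustrating this; the broker address, group id, and topic name are illustrative:

    import java.time.Duration;
    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    public class PollDemo {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");   // illustrative broker
            props.put("group.id", "demo-group");                // illustrative group id
            props.put("max.poll.records", "500");               // an upper bound per poll, not a minimum
            props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(Collections.singletonList("my-topic"));
                while (true) {
                    // Returns whatever is available, anywhere from 0 to 500 records, as soon
                    // as data arrives or the timeout expires; it never blocks until the
                    // 500-record cap is reached.
                    ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
                    for (ConsumerRecord<String, String> record : records) {
                        System.out.printf("offset=%d value=%s%n", record.offset(), record.value());
                    }
                }
            }
        }
    }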

How to display topics using Kafka Clients in Java?

Submitted by 筅森魡賤 on 2021-02-20 04:55:27
Question:

    import java.util.Properties;
    import java.util.Set;
    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.AdminClientConfig;

    public static void main(String[] args) throws Exception {
        Properties config = new Properties();
        config.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "mybroker.ip.address:9092");
        AdminClient admin = AdminClient.create(config);
        // names() returns a KafkaFuture<Set<String>>, so get() yields the topic names
        Set<String> topicNames = admin.listTopics().names().get();
    }

I am catching an ExecutionException with the error message: org.apache.kafka.common.errors.TimeoutException: Call(callName:listTopics, deadlineMs=1599813311360, tries=1, nextAllowedTryMs=-9223372034707292162) timed out at…
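For reference, a TimeoutException here usually means the AdminClient could not reach the broker at mybroker.ip.address:9092 at all (wrong address, firewall, or an advertised.listeners mismatch) rather than a coding problem. A hedged sketch that fails fast instead of waiting out the default timeout and always closes the client; the timeout value is illustrative:

    import java.util.Properties;
    import java.util.Set;
    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.AdminClientConfig;

    public class ListTopicsDemo {
        public static void main(String[] args) throws Exception {
            Properties config = new Properties();
            config.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "mybroker.ip.address:9092");
            // Fail quickly if the broker cannot be reached, rather than waiting out
            // the much longer default timeout.
            config.put(AdminClientConfig.REQUEST_TIMEOUT_MS_CONFIG, "5000");
            try (AdminClient admin = AdminClient.create(config)) {
                Set<String> topicNames = admin.listTopics().names().get();
                topicNames.forEach(System.out::println);
            }
        }
    }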

Spark Streaming not reading from Kafka topics

Submitted by 孤街醉人 on 2021-02-20 02:49:25
Question: I have set up Kafka and Spark on Ubuntu and am trying to read Kafka topics through Spark Streaming using pyspark (in a Jupyter notebook). Spark is neither reading the data nor throwing any error. The Kafka producer and consumer communicate with each other fine on the terminal. Kafka is configured with three brokers on ports 9092, 9093, and 9094, and messages are getting stored in the Kafka topics. Now I want to read them through Spark Streaming. I am not sure what I am missing; I have also searched for this on the internet, but…
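One common cause of "no data, no error" is that a streaming query defaults to startingOffsets = latest, so messages already sitting in the topic are never read. A hedged sketch of the equivalent read using Structured Streaming, shown here in Java; the broker ports and topic name are illustrative, and the spark-sql-kafka-0-10 package must be on the classpath:

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;

    public class KafkaStreamRead {
        public static void main(String[] args) throws Exception {
            SparkSession spark = SparkSession.builder()
                    .appName("kafka-read")
                    .master("local[*]")
                    .getOrCreate();

            Dataset<Row> df = spark.readStream()
                    .format("kafka")
                    .option("kafka.bootstrap.servers", "localhost:9092,localhost:9093,localhost:9094")
                    .option("subscribe", "my-topic")
                    .option("startingOffsets", "earliest")  // also read messages already in the topic
                    .load();

            // Kafka records arrive as binary key/value columns; cast to strings for display.
            df.selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)")
              .writeStream()
              .format("console")
              .start()
              .awaitTermination();
        }
    }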

Not able to run Kafka Connect in distributed mode - Error while attempting to create/find topics 'connect-offsets'

Submitted by 微笑、不失礼 on 2021-02-19 06:23:30
Question:

    [2017-08-31 10:15:20,715] WARN The configuration 'internal.key.converter' was supplied but isn't a known config. (org.apache.kafka.clients.admin.AdminClientConfig:231)
    [2017-08-31 10:15:20,715] WARN The configuration 'status.storage.replication.factor' was supplied but isn't a known config. (org.apache.kafka.clients.admin.AdminClientConfig:231)
    [2017-08-31 10:15:20,715] WARN The configuration 'internal.value.converter.schemas.enable' was supplied but isn't a known config. (org.apache.kafka…
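These WARN lines are benign: the Connect worker passes its whole configuration to the internal AdminClient, which logs a warning for every key it does not recognize. When the worker then fails to create 'connect-offsets', a common cause on a small cluster is that the internal topics default to replication factor 3, which a single-broker cluster cannot satisfy; either lower offset.storage.replication.factor (and the status/config equivalents) in the worker properties, or pre-create the topics. A hedged sketch pre-creating connect-offsets with the AdminClient; the broker address, partition count, and replication factor are illustrative:

    import java.util.Collections;
    import java.util.Map;
    import java.util.Properties;
    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.AdminClientConfig;
    import org.apache.kafka.clients.admin.NewTopic;

    public class CreateConnectOffsets {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            try (AdminClient admin = AdminClient.create(props)) {
                // Connect requires its offsets topic to be log-compacted; 25 partitions
                // matches the worker's offset.storage.partitions default, and replication
                // factor 1 suits a single-broker cluster.
                NewTopic offsets = new NewTopic("connect-offsets", 25, (short) 1)
                        .configs(Map.of("cleanup.policy", "compact"));
                admin.createTopics(Collections.singleton(offsets)).all().get();
            }
        }
    }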

Kafka ignoring `transaction.timeout.ms` for producer

Submitted by 瘦欲@ on 2021-02-19 04:46:25
Question: I configure the producer with a 5-second transaction timeout using the transaction.timeout.ms property. However, the transaction seems to be aborted only after 60 seconds, which is much longer. See the following program:

    Properties properties = new Properties();
    properties.setProperty("bootstrap.servers", brokerConnectionString);
    properties.setProperty("transactional.id", "my-transactional-id");
    properties.setProperty("transaction.timeout.ms", "5000");
    // start the first producer and write one event…
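Two broker-side settings are worth checking here: transaction.max.timeout.ms caps whatever timeout the producer requests, and the coordinator only scans for timed-out transactions periodically (transaction.abort.timed.out.transaction.cleanup.interval.ms), so an abort does not happen the instant the timeout elapses. A hedged, self-contained version of the experiment; the broker address and topic name are illustrative, and the transaction is deliberately never committed:

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class TxnTimeoutDemo {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.setProperty("bootstrap.servers", "localhost:9092");  // illustrative
            props.setProperty("transactional.id", "my-transactional-id");
            props.setProperty("transaction.timeout.ms", "5000");       // what the producer requests
            props.setProperty("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            props.setProperty("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

            KafkaProducer<String, String> producer = new KafkaProducer<>(props);
            producer.initTransactions();
            producer.beginTransaction();
            producer.send(new ProducerRecord<>("events", "key", "value")); // "events" is illustrative
            producer.flush();
            // Deliberately never commit: the coordinator should abort the transaction once
            // transaction.timeout.ms elapses, but only when its periodic scan for timed-out
            // transactions next runs, so the abort is not instantaneous.
        }
    }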

Periodic NPE In Kafka Streams Processor Context

Submitted by 会有一股神秘感。 on 2021-02-19 04:07:35
Question: Using kafka-streams 0.10.0.0, I am periodically seeing a NullPointerException in the StreamTask when forwarding a message; it happens in 10% to 50% of invocations. The NPE occurs in this method:

    public <K, V> void forward(K key, V value) {
        ProcessorNode thisNode = currNode;
        try {
            for (ProcessorNode childNode : (List<ProcessorNode<K, V>>) thisNode.children()) {
                currNode = childNode;
                childNode.process(key, value);
            }
        } finally {
            currNode = thisNode;
        }
    }

It seems that in some cases, the…
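For context, currNode in the code above is plain mutable state owned by the stream thread; ProcessorContext.forward() is not safe to call from any other thread, and calling it from an external timer or callback thread can observe currNode as null and produce exactly this NPE. A hedged sketch of the thread-safe pattern, keeping all forwarding on the stream thread via the built-in punctuation mechanism; the class name is illustrative, and the API is shown roughly as in the 0.10.x line:

    import org.apache.kafka.streams.processor.Processor;
    import org.apache.kafka.streams.processor.ProcessorContext;

    public class MyProcessor implements Processor<String, String> {
        private ProcessorContext context;

        @Override
        public void init(ProcessorContext context) {
            this.context = context;
            // Request periodic punctuate() callbacks instead of running an external timer;
            // they are invoked on the stream thread, where forward() is safe.
            context.schedule(1000L);
        }

        @Override
        public void process(String key, String value) {
            context.forward(key, value); // safe: runs on the stream thread
        }

        @Override
        public void punctuate(long timestamp) {
            // also safe: invoked by the stream thread, never from an external timer
        }

        @Override
        public void close() {}
    }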

How to programmatically get a schema from the Confluent Schema Registry in Python?

Submitted by 拥有回忆 on 2021-02-19 03:43:06
Question: As of now I am doing something like this, reading an .avsc file to get the schema:

    value_schema = avro.load('client.avsc')

Can I do something to get the schema from the Confluent Schema Registry using the topic name? I found one way but didn't figure out how to use it: https://github.com/marcosschroh/python-schema-registry-client

Answer 1: Using confluent-kafka-python:

    from confluent_kafka.avro.cached_schema_registry_client import CachedSchemaRegistryClient

    sr = CachedSchemaRegistryClient({'url': 'http://localhost:8081'})
    # e.g. fetch the latest value schema registered for a topic; the subject name is illustrative
    schema_id, schema, version = sr.get_latest_schema('my-topic-value')

Commit a message asynchronously just after reading it from the topic

Submitted by 笑着哭i on 2021-02-19 03:25:08
Question: I'm trying to commit a message just after reading it from the topic. I've followed this link (https://www.confluent.io/blog/apache-kafka-spring-boot-application) to create a Kafka consumer with Spring. Normally it works perfectly: the consumer gets a message and waits until another one enters the queue. The problem is that processing these messages takes a long time (around 10 minutes), so Kafka decides that the message was not consumed (committed) and the consumer reads it…
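Note that even with the offset committed, spending ~10 minutes inside the listener exceeds the consumer's max.poll.interval.ms (5 minutes by default), which triggers a rebalance and redelivery; raise that property or move the slow work to another thread. A hedged sketch using Spring for Apache Kafka that acknowledges the record immediately after reading, before the slow processing starts; it assumes the listener container factory is configured with AckMode.MANUAL_IMMEDIATE, and the topic and group names are illustrative:

    import org.springframework.kafka.annotation.KafkaListener;
    import org.springframework.kafka.support.Acknowledgment;
    import org.springframework.stereotype.Component;

    @Component
    public class SlowConsumer {

        @KafkaListener(topics = "my-topic", groupId = "my-group")
        public void listen(String message, Acknowledgment ack) {
            ack.acknowledge();   // commit the offset immediately after reading
            process(message);    // the long-running work (~10 minutes) happens afterwards
        }

        private void process(String message) {
            // slow processing goes here
        }
    }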

Kafka JDBC Sink Connector, insert values in batches

Submitted by 谁都会走 on 2021-02-18 18:50:35
Question: I receive a lot of messages per second (50,000-100,000) over HTTP and want to save them to PostgreSQL. I decided to use the Kafka JDBC Sink connector for this purpose. Currently the messages are saved to the database one record at a time rather than in batches, and I want to insert records into PostgreSQL in batches of 500-1000 records. I found some answers to this problem in the issue "How to use batch.size?" and tried the related configuration options, but they seem to have no effect. My Kafka JDBC Sink…
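For what it's worth, the batch a JDBC sink task writes is bounded on two sides: the connector's own batch.size (how many rows go into one SQL batch) and how many records Connect hands to put() per consumer poll (max.poll.records, 500 by default). A hedged sketch of the relevant settings in standalone .properties form; connection details and names are illustrative, and the consumer.override.* prefix needs connector.client.config.override.policy=All on the worker (available since Kafka 2.3):

    name=my-jdbc-sink
    connector.class=io.confluent.connect.jdbc.JdbcSinkConnector
    topics=my-topic
    connection.url=jdbc:postgresql://localhost:5432/mydb
    connection.user=postgres
    connection.password=secret
    insert.mode=insert
    # Upper bound on rows per SQL batch written by the sink task.
    batch.size=500
    # Upper bound on records the Connect consumer delivers per put() call.
    consumer.override.max.poll.records=500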
