apache-kafka

How does max.poll.records affect the consumer poll?

Submitted by ぃ、小莉子 on 2021-02-20 18:50:29
Question: max.poll.records was recently changed to 500 in our Kafka consumer config, and I am wondering how this affects the consumer poll. Is it just an upper bound on the number of records that can be fetched, or does the consumer wait until it gets 500 records?

Answer 1: max.poll.records: yes, in the new consumer this property defaults to 500, which means a single poll can return anywhere from 1 to at most 500 records, and the consumer will not wait when a partition does not have…
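In other words, max.poll.records is only a cap: poll() returns as soon as any records are available (or its timeout expires) and never blocks waiting for the cap to fill. A minimal sketch illustrating this; the broker address, group id, and topic name are illustrative:

    import java.time.Duration;
    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    public class PollDemo {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");   // illustrative broker
            props.put("group.id", "demo-group");                // illustrative group id
            props.put("max.poll.records", "500");               // an upper bound per poll, not a minimum
            props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(Collections.singletonList("my-topic"));
                while (true) {
                    // Returns whatever is available, anywhere from 0 to 500 records, as soon
                    // as data arrives or the timeout expires; it never blocks until the
                    // 500-record cap is reached.
                    ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
                    for (ConsumerRecord<String, String> record : records) {
                        System.out.printf("offset=%d value=%s%n", record.offset(), record.value());
                    }
                }
            }
        }
    }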

How to display topics using Kafka Clients in Java?

Submitted by 筅森魡賤 on 2021-02-20 04:55:27
Question:

    import java.util.Properties;
    import java.util.Set;
    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.AdminClientConfig;

    public static void main(String[] args) throws Exception {
        Properties config = new Properties();
        config.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "mybroker.ip.address:9092");
        AdminClient admin = AdminClient.create(config);
        // names() returns a KafkaFuture<Set<String>>, so get() yields the topic names
        Set<String> topicNames = admin.listTopics().names().get();
    }

I am catching an ExecutionException with the error message: org.apache.kafka.common.errors.TimeoutException: Call(callName:listTopics, deadlineMs=1599813311360, tries=1, nextAllowedTryMs=-9223372034707292162) timed out at…
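For reference, a TimeoutException here usually means the AdminClient could not reach the broker at mybroker.ip.address:9092 at all (wrong address, firewall, or an advertised.listeners mismatch) rather than a coding problem. A hedged sketch that fails fast instead of waiting out the default timeout and always closes the client; the timeout value is illustrative:

    import java.util.Properties;
    import java.util.Set;
    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.AdminClientConfig;

    public class ListTopicsDemo {
        public static void main(String[] args) throws Exception {
            Properties config = new Properties();
            config.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "mybroker.ip.address:9092");
            // Fail quickly if the broker cannot be reached, rather than waiting out
            // the much longer default timeout.
            config.put(AdminClientConfig.REQUEST_TIMEOUT_MS_CONFIG, "5000");
            try (AdminClient admin = AdminClient.create(config)) {
                Set<String> topicNames = admin.listTopics().names().get();
                topicNames.forEach(System.out::println);
            }
        }
    }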

Spark Streaming not reading from Kafka topics

Submitted by 孤街醉人 on 2021-02-20 02:49:25
Question: I have set up Kafka and Spark on Ubuntu and am trying to read Kafka topics through Spark Streaming using pyspark (in a Jupyter notebook). Spark is neither reading the data nor throwing any error. The Kafka producer and consumer communicate with each other fine on the terminal. Kafka is configured with three brokers on ports 9092, 9093, and 9094, and messages are getting stored in the Kafka topics. Now I want to read them through Spark Streaming. I am not sure what I am missing; I have also searched for this on the internet, but…
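One common cause of "no data, no error" is that a streaming query defaults to startingOffsets = latest, so messages already sitting in the topic are never read. A hedged sketch of the equivalent read using Structured Streaming, shown here in Java; the broker ports and topic name are illustrative, and the spark-sql-kafka-0-10 package must be on the classpath:

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;

    public class KafkaStreamRead {
        public static void main(String[] args) throws Exception {
            SparkSession spark = SparkSession.builder()
                    .appName("kafka-read")
                    .master("local[*]")
                    .getOrCreate();

            Dataset<Row> df = spark.readStream()
                    .format("kafka")
                    .option("kafka.bootstrap.servers", "localhost:9092,localhost:9093,localhost:9094")
                    .option("subscribe", "my-topic")
                    .option("startingOffsets", "earliest")  // also read messages already in the topic
                    .load();

            // Kafka records arrive as binary key/value columns; cast to strings for display.
            df.selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)")
              .writeStream()
              .format("console")
              .start()
              .awaitTermination();
        }
    }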

Not able to run Kafka Connect in distributed mode - Error while attempting to create/find topics 'connect-offsets'

Submitted by 微笑、不失礼 on 2021-02-19 06:23:30
Question:

    [2017-08-31 10:15:20,715] WARN The configuration 'internal.key.converter' was supplied but isn't a known config. (org.apache.kafka.clients.admin.AdminClientConfig:231)
    [2017-08-31 10:15:20,715] WARN The configuration 'status.storage.replication.factor' was supplied but isn't a known config. (org.apache.kafka.clients.admin.AdminClientConfig:231)
    [2017-08-31 10:15:20,715] WARN The configuration 'internal.value.converter.schemas.enable' was supplied but isn't a known config. (org.apache.kafka…
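These WARN lines are benign: the Connect worker passes its whole configuration to the internal AdminClient, which logs a warning for every key it does not recognize. When the worker then fails to create 'connect-offsets', a common cause on a small cluster is that the internal topics default to replication factor 3, which a single-broker cluster cannot satisfy; either lower offset.storage.replication.factor (and the status/config equivalents) in the worker properties, or pre-create the topics. A hedged sketch pre-creating connect-offsets with the AdminClient; the broker address, partition count, and replication factor are illustrative:

    import java.util.Collections;
    import java.util.Map;
    import java.util.Properties;
    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.AdminClientConfig;
    import org.apache.kafka.clients.admin.NewTopic;

    public class CreateConnectOffsets {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            try (AdminClient admin = AdminClient.create(props)) {
                // Connect requires its offsets topic to be log-compacted; 25 partitions
                // matches the worker's offset.storage.partitions default, and replication
                // factor 1 suits a single-broker cluster.
                NewTopic offsets = new NewTopic("connect-offsets", 25, (short) 1)
                        .configs(Map.of("cleanup.policy", "compact"));
                admin.createTopics(Collections.singleton(offsets)).all().get();
            }
        }
    }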

Kafka ignoring `transaction.timeout.ms` for producer

Submitted by 瘦欲@ on 2021-02-19 04:46:25
Question: I configure the producer with a 5-second transaction timeout using the transaction.timeout.ms property. However, the transaction seems to be aborted only after 60 seconds, which is much longer. See the following program:

    Properties properties = new Properties();
    properties.setProperty("bootstrap.servers", brokerConnectionString);
    properties.setProperty("transactional.id", "my-transactional-id");
    properties.setProperty("transaction.timeout.ms", "5000");
    // start the first producer and write one event…
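Two broker-side settings are worth checking here: transaction.max.timeout.ms caps whatever timeout the producer requests, and the coordinator only scans for timed-out transactions periodically (transaction.abort.timed.out.transaction.cleanup.interval.ms), so an abort does not happen the instant the timeout elapses. A hedged, self-contained version of the experiment; the broker address and topic name are illustrative, and the transaction is deliberately never committed:

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class TxnTimeoutDemo {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.setProperty("bootstrap.servers", "localhost:9092");  // illustrative
            props.setProperty("transactional.id", "my-transactional-id");
            props.setProperty("transaction.timeout.ms", "5000");       // what the producer requests
            props.setProperty("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            props.setProperty("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

            KafkaProducer<String, String> producer = new KafkaProducer<>(props);
            producer.initTransactions();
            producer.beginTransaction();
            producer.send(new ProducerRecord<>("events", "key", "value")); // "events" is illustrative
            producer.flush();
            // Deliberately never commit: the coordinator should abort the transaction once
            // transaction.timeout.ms elapses, but only when its periodic scan for timed-out
            // transactions next runs, so the abort is not instantaneous.
        }
    }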

Periodic NPE In Kafka Streams Processor Context

Submitted by 会有一股神秘感。 on 2021-02-19 04:07:35
Question: Using kafka-streams 0.10.0.0, I am periodically seeing a NullPointerException in the StreamTask when forwarding a message; it happens in 10% to 50% of invocations. The NPE occurs in this method:

    public <K, V> void forward(K key, V value) {
        ProcessorNode thisNode = currNode;
        try {
            for (ProcessorNode childNode : (List<ProcessorNode<K, V>>) thisNode.children()) {
                currNode = childNode;
                childNode.process(key, value);
            }
        } finally {
            currNode = thisNode;
        }
    }

It seems that in some cases, the…
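For context, currNode in the code above is plain mutable state owned by the stream thread; ProcessorContext.forward() is not safe to call from any other thread, and calling it from an external timer or callback thread can observe currNode as null and produce exactly this NPE. A hedged sketch of the thread-safe pattern, keeping all forwarding on the stream thread via the built-in punctuation mechanism; the class name is illustrative, and the API is shown roughly as in the 0.10.x line:

    import org.apache.kafka.streams.processor.Processor;
    import org.apache.kafka.streams.processor.ProcessorContext;

    public class MyProcessor implements Processor<String, String> {
        private ProcessorContext context;

        @Override
        public void init(ProcessorContext context) {
            this.context = context;
            // Request periodic punctuate() callbacks instead of running an external timer;
            // they are invoked on the stream thread, where forward() is safe.
            context.schedule(1000L);
        }

        @Override
        public void process(String key, String value) {
            context.forward(key, value); // safe: runs on the stream thread
        }

        @Override
        public void punctuate(long timestamp) {
            // also safe: invoked by the stream thread, never from an external timer
        }

        @Override
        public void close() {}
    }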

How to programmatically get a schema from the Confluent Schema Registry in Python?

Submitted by 拥有回忆 on 2021-02-19 03:43:06
Question: As of now I am doing something like this, reading an .avsc file to get the schema:

    value_schema = avro.load('client.avsc')

Can I do something to get the schema from the Confluent Schema Registry using the topic name? I found one way but didn't figure out how to use it: https://github.com/marcosschroh/python-schema-registry-client

Answer 1: Using confluent-kafka-python:

    from confluent_kafka.avro.cached_schema_registry_client import CachedSchemaRegistryClient

    sr = CachedSchemaRegistryClient({'url': 'http://localhost:8081'})
    # e.g. fetch the latest value schema registered for a topic; the subject name is illustrative
    schema_id, schema, version = sr.get_latest_schema('my-topic-value')

Commit a message asynchronously just after reading it from the topic

Submitted by 笑着哭i on 2021-02-19 03:25:08
Question: I'm trying to commit a message just after reading it from the topic. I've followed this link (https://www.confluent.io/blog/apache-kafka-spring-boot-application) to create a Kafka consumer with Spring. Normally it works perfectly: the consumer gets a message and waits until another one enters the queue. The problem is that processing these messages takes a long time (around 10 minutes), so Kafka decides that the message was not consumed (committed) and the consumer reads it…
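Note that even with the offset committed, spending ~10 minutes inside the listener exceeds the consumer's max.poll.interval.ms (5 minutes by default), which triggers a rebalance and redelivery; raise that property or move the slow work to another thread. A hedged sketch using Spring for Apache Kafka that acknowledges the record immediately after reading, before the slow processing starts; it assumes the listener container factory is configured with AckMode.MANUAL_IMMEDIATE, and the topic and group names are illustrative:

    import org.springframework.kafka.annotation.KafkaListener;
    import org.springframework.kafka.support.Acknowledgment;
    import org.springframework.stereotype.Component;

    @Component
    public class SlowConsumer {

        @KafkaListener(topics = "my-topic", groupId = "my-group")
        public void listen(String message, Acknowledgment ack) {
            ack.acknowledge();   // commit the offset immediately after reading
            process(message);    // the long-running work (~10 minutes) happens afterwards
        }

        private void process(String message) {
            // slow processing goes here
        }
    }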

Kafka JDBC Sink Connector, insert values in batches

Submitted by 谁都会走 on 2021-02-18 18:50:35
Question: I receive a lot of messages per second (50,000-100,000) over HTTP and want to save them to PostgreSQL. I decided to use the Kafka JDBC Sink connector for this purpose. Currently the messages are saved to the database one record at a time rather than in batches, and I want to insert records into PostgreSQL in batches of 500-1000 records. I found some answers to this problem in the issue "How to use batch.size?" and tried the related configuration options, but they seem to have no effect. My Kafka JDBC Sink…
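For what it's worth, the batch a JDBC sink task writes is bounded on two sides: the connector's own batch.size (how many rows go into one SQL batch) and how many records Connect hands to put() per consumer poll (max.poll.records, 500 by default). A hedged sketch of the relevant settings in standalone .properties form; connection details and names are illustrative, and the consumer.override.* prefix needs connector.client.config.override.policy=All on the worker (available since Kafka 2.3):

    name=my-jdbc-sink
    connector.class=io.confluent.connect.jdbc.JdbcSinkConnector
    topics=my-topic
    connection.url=jdbc:postgresql://localhost:5432/mydb
    connection.user=postgres
    connection.password=secret
    insert.mode=insert
    # Upper bound on rows per SQL batch written by the sink task.
    batch.size=500
    # Upper bound on records the Connect consumer delivers per put() call.
    consumer.override.max.poll.records=500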
