kafka-consumer-api

kafka-connect error: cannot find or load main class

大城市里の小女人 submitted on 2019-12-08 05:46:16

Question: I am following the official docs to set up Kafka Connect to read data from a file. I have Kafka running perfectly: a producer and a consumer are sending and receiving messages. However, when I run the following command:

    sudo ./bin/connect-standalone.sh config/connect-standalone.properties config/connect-file-source.properties

I get the following error:

    Error: Could not find or load main class org.apache.kafka.connect.cli.ConnectStandalone

I cross-checked and I have the file …

Order of messages from multiple topics in Kafka

妖精的绣舞 submitted on 2019-12-08 01:08:09

Question: I'm developing software that uses Apache Kafka. I have one consumer subscribed to multiple topics, and I'd like to know whether there is a defined order for receiving messages from those topics. I tried some combinations on my machine, but I need to be sure about this. Example:

Consumer subscribes to topic1 and topic2
Producer1 writes something to topic1
Producer2 writes something to topic2
Producer1 writes something to topic1

When the consumer polls, it receives a list of records containing first the messages …
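Kafka only guarantees ordering within a single partition; there is no ordering contract across topics, or even across partitions of one topic. A minimal sketch of walking a polled batch partition by partition to make that grouping visible (broker address, group id, and topic names are assumptions):

    import java.time.Duration;
    import java.util.Arrays;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.TopicPartition;

    public class OrderingSketch {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            props.put("group.id", "ordering-check");
            props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(Arrays.asList("topic1", "topic2"));
                ConsumerRecords<String, String> batch = consumer.poll(Duration.ofMillis(500));
                // poll() hands back records grouped by partition; order is guaranteed
                // only inside each partition, never across topic1 and topic2
                for (TopicPartition tp : batch.partitions()) {
                    for (ConsumerRecord<String, String> rec : batch.records(tp)) {
                        System.out.printf("%s-%d @%d: %s%n",
                                tp.topic(), tp.partition(), rec.offset(), rec.value());
                    }
                }
            }
        }
    }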

Can I consume based on a specific condition in Kafka?

十年热恋 submitted on 2019-12-08 01:04:40

Question: I'm writing a message into Kafka and consuming it at the other end, doing some processing on it and writing the result back to another Kafka topic. I want to know which response message belongs to which request. Currently I've decided to capture the offset id on the consumer side, write it into the response, and then read the response payload and match on that. With this approach we need to read every message; is there any other way to consume based on a condition in the consumer config? Answer 1: Consumers can only read the whole …
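One common pattern (an addition here, not part of the quoted answer) is to stamp each request with a correlation id in a record header and have the responder copy it onto the response record: the requester still reads every response, but matching becomes trivial and needs no payload parsing. A sketch, assuming a local broker and a topic named requests:

    import java.nio.charset.StandardCharsets;
    import java.util.Properties;
    import java.util.UUID;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class RequestWithCorrelationId {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                String correlationId = UUID.randomUUID().toString();
                ProducerRecord<String, String> request =
                        new ProducerRecord<>("requests", "request-payload");
                // the responder copies this header onto its record in the response
                // topic, so the requester can match responses to requests by id
                request.headers().add("correlation-id",
                        correlationId.getBytes(StandardCharsets.UTF_8));
                producer.send(request);
            }
        }
    }

Record headers require kafka-clients and a broker at version 0.11 or later.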

Bucket records based on time (kafka-hdfs-connector)

前提是你 submitted on 2019-12-07 23:27:31

Question: I am trying to copy data from Kafka into Hive tables using the kafka-hdfs-connector provided by the Confluent platform. While I am able to do that successfully, I was wondering how to bucket the incoming data by time interval. For example, I would like a new partition to be created every 5 minutes. I tried io.confluent.connect.hdfs.partitioner.TimeBasedPartitioner with partition.duration.ms, but I think I am doing it the wrong way. I see only one partition in the Hive table, with all the data …
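For reference, TimeBasedPartitioner also needs a path format plus locale and timezone settings alongside the duration, or it cannot build the time buckets. A sketch of the relevant connector properties (the values below are illustrative assumptions for 5-minute buckets):

    partitioner.class=io.confluent.connect.hdfs.partitioner.TimeBasedPartitioner
    # 5-minute buckets
    partition.duration.ms=300000
    # directory layout per bucket; the pattern follows Joda-time date formats
    path.format='year'=YYYY/'month'=MM/'day'=dd/'hour'=HH/'minute'=mm
    locale=en
    timezone=UTC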

Kafka Consumer Error - xxxx nodename nor servname provided, or not known

时间秒杀一切 submitted on 2019-12-07 23:23:59

Question: When running the console consumer using the following command:

    $ ~/project/libs/kafka_2.9.2-0.8.1.1/bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic customerevents --autocommit.interval.ms 100 --group customereventsgroup

I get the following error:

    Exception in thread "main" java.net.UnknownHostException: HQSML-142453: HQSML-142453: nodename nor servname provided, or not known
        at java.net.InetAddress.getLocalHost(InetAddress.java:1473)
        at kafka.consumer.ZookeeperConsumerConnector …
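The consumer is failing while resolving the machine's own hostname (HQSML-142453). A common fix on macOS, where the "nodename nor servname" wording comes from (an assumption based on that error text, not a confirmed resolution from this post), is to map the hostname to loopback in /etc/hosts:

    127.0.0.1   localhost HQSML-142453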

How to control the concurrency of processing messages with ConsumerGroup

泪湿孤枕 submitted on 2019-12-07 15:45:28

I am using the kafka-node ConsumerGroup to consume messages from a topic. When the ConsumerGroup consumes a message it has to call an external API, which might take as long as a second to respond. I want to hold off consuming the next message from the queue until the API responds, so that messages are processed sequentially. How should I control this behavior? This is how we have implemented processing of one message at a time:

    var async = require('async'); // npm install async

    // initialize a local worker queue with concurrency 1, so only one
    // message is processed at a time
    var q = async.queue(function (message, done) {
        // call the external API here and invoke done() when it responds,
        // releasing the queue for the next message (callExternalApi is a
        // hypothetical helper standing in for the real call)
        callExternalApi(message, done);
    }, 1);

Re-processing/reading Kafka records/messages again - What is the purpose of Consumer Group Offset Reset?

隐身守侯 submitted on 2019-12-07 15:40:01

Question: My Kafka topic has 10 records/messages in total, spread over 2 partitions with 5 messages each. My consumer group has 2 consumers, and each consumer has already read the 5 messages from its assigned partition. Now I want to re-process/read the messages in the topic from the start/beginning (offset 0). I stopped my Kafka consumers and ran the following command to reset the consumer group offset to 0:

    ./kafka-consumer-groups.sh --group cg1 --reset-offsets --to-offset 0 --topic t1 --execute - …
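As a programmatic alternative to the CLI reset (a sketch, not taken from this post), a consumer can rewind its own assigned partitions with seekToBeginning once the group assignment has happened; broker address and topic/group names below match the question but are otherwise assumptions:

    import java.time.Duration;
    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    public class RewindToStart {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            props.put("group.id", "cg1");
            props.put("enable.auto.commit", "false");
            props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(Collections.singletonList("t1"));
                while (consumer.assignment().isEmpty()) {
                    consumer.poll(Duration.ofMillis(100)); // wait until partitions are assigned
                }
                consumer.seekToBeginning(consumer.assignment()); // rewind every assigned partition
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                System.out.println("re-read " + records.count() + " records");
            }
        }
    }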

Kafka 2.1.0 broker hangs for no reason

两盒软妹~` submitted on 2019-12-07 12:36:51

Question: At first, all the brokers in the cluster start and work just fine, but sometimes one of the brokers runs into a problem, and the following symptoms show up:

the whole cluster hangs; neither producers nor consumers work, and the network traffic drops to zero on the monitor;
kafka-topics.sh describes the topic as healthy; every replica looks fine, even the problematic broker id, and the information in ZooKeeper also looks normal;
the open file-descriptor count increases gradually on the abnormal broker, which is …

Request messages between two timestamps from Kafka

陌路散爱 submitted on 2019-12-07 04:26:34

Question: Is it possible to consume messages from Kafka based on the time period in which the messages were ingested? Example: I want all messages ingested into a topic between 09:00 and 10:00 today (and it's now 12:00). If there is only a way to specify a start time, that's fine; my consumer can stop processing messages once it reaches the end time. I can see methods for requesting messages from a given offset, for getting the first available offset, and for the earliest available offset, but not all …
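The consumer API does cover the start-time half directly: KafkaConsumer.offsetsForTimes maps a timestamp to the earliest offset whose record timestamp is equal or later, and the consumer can stop once record timestamps pass the end of the window. A sketch (broker address, topic name, and the 09:00-10:00 window are assumptions):

    import java.time.Duration;
    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.clients.consumer.OffsetAndTimestamp;
    import org.apache.kafka.common.PartitionInfo;
    import org.apache.kafka.common.TopicPartition;

    public class TimeRangeConsumer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            props.put("group.id", "time-range-reader");
            props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

            long endMs = System.currentTimeMillis() - 2 * 60 * 60 * 1000L; // 10:00 if now is 12:00
            long startMs = endMs - 60 * 60 * 1000L;                        // 09:00

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                List<TopicPartition> partitions = new ArrayList<>();
                for (PartitionInfo p : consumer.partitionsFor("events")) {
                    partitions.add(new TopicPartition(p.topic(), p.partition()));
                }
                consumer.assign(partitions);

                Map<TopicPartition, Long> query = new HashMap<>();
                for (TopicPartition tp : partitions) {
                    query.put(tp, startMs);
                }
                // for each partition, seek to the first offset whose timestamp >= startMs
                for (Map.Entry<TopicPartition, OffsetAndTimestamp> e
                        : consumer.offsetsForTimes(query).entrySet()) {
                    if (e.getValue() != null) {
                        consumer.seek(e.getKey(), e.getValue().offset());
                    }
                }

                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> rec : records) {
                    if (rec.timestamp() >= endMs) {
                        continue; // past the requested window on this partition
                    }
                    System.out.println(rec.value());
                }
            }
        }
    }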

How can I initialize Kafka ConsumerRecords<String,String> for testing

无人久伴 submitted on 2019-12-07 04:13:25

Question: I am writing test cases for Kafka consumer components and mocking kafkaConsumer.poll(), which returns an instance of ConsumerRecords<String,String>. I want to initialize a ConsumerRecords object and use it in the mock, but the constructors of ConsumerRecords seem to expect an actual Kafka topic, which I don't have in tests. One way I can think of is keeping a serialized copy of the object and deserializing it to initialize ConsumerRecords. Is there any other way to achieve the same? Answer 1: Here is some example code (Kafka …
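A ConsumerRecords instance can in fact be built without any broker, because TopicPartition only takes a topic name string. A minimal sketch (topic name, key, and value are placeholders):

    import java.util.Collections;
    import java.util.List;
    import java.util.Map;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.common.TopicPartition;

    public class ConsumerRecordsFixture {
        public static ConsumerRecords<String, String> singleRecord() {
            // TopicPartition only needs a topic *name*; no running broker is involved
            TopicPartition tp = new TopicPartition("test-topic", 0);
            ConsumerRecord<String, String> rec =
                    new ConsumerRecord<>("test-topic", 0, 0L, "key", "value");
            Map<TopicPartition, List<ConsumerRecord<String, String>>> data =
                    Collections.singletonMap(tp, Collections.singletonList(rec));
            return new ConsumerRecords<>(data);
        }
    }

kafka-clients also ships org.apache.kafka.clients.consumer.MockConsumer, which can be fed such records and often avoids mocking poll() by hand.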