kafka-topic

Apache Flink - Partitioning the stream equally as the input Kafka topic

余生颓废 submitted on 2021-01-29 09:46:30
Question: I would like to implement the following scenario in Apache Flink: given a Kafka topic with 4 partitions, I would like to process the data within each partition independently in Flink, using different logic depending on the event's type. In particular, suppose the input Kafka topic contains the events depicted in the previous images. Each event has a different structure: partition 1 has the field "a" as key, partition 2 has the field "b" as key, etc. In Flink I would like to apply different

How to delete data that has already been consumed by the consumer? Kafka

此生再无相见时 submitted on 2021-01-28 03:51:33
Question: I am doing data replication in Kafka, but the Kafka log file grows very quickly; it reaches 5 GB in a day. To solve this problem, I want to delete processed data immediately. I am using the deleteRecords method of AdminClient to delete records up to an offset. But when I look at the log file, the data corresponding to that offset is not deleted. RecordsToDelete recordsToDelete = RecordsToDelete.beforeOffset(offset); TopicPartition topicPartition = new TopicPartition(topicName, partition)

How does Kafka guarantee message order when we increase the partitions at runtime?

谁都会走 submitted on 2021-01-27 16:28:25
Question: I am new to Kafka, and from reading the Kafka docs I understand that messages with the same key are mapped to the same partition to guarantee ordering. This totally makes sense. However, I'd like to know: if we increase the number of topic partitions at runtime, will new messages with the same key be hashed to the same (old) partition as before? If so, and all messages are provided with keys, will none of them ever be mapped to a new partition? This doesn't make sense to me.
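
The mechanism the question asks about can be sketched as a hash of the key taken modulo the partition count. The snippet below is only an illustration under assumptions: Kafka's default partitioner hashes the serialized key bytes with murmur2, whereas this sketch substitutes String.hashCode() to stay dependency-free, and the names PartitionDemo and partitionFor are hypothetical.

```java
import java.util.List;

// Illustrative only: Kafka's default partitioner hashes the serialized key
// with murmur2; String.hashCode() is used here as a simple stand-in.
class PartitionDemo {

    // Map a key to a partition by hashing it and taking the result modulo
    // the partition count (floorMod keeps the result non-negative).
    static int partitionFor(String key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    public static void main(String[] args) {
        // Compare each key's partition before and after growing the topic.
        for (String key : List.of("a", "d", "f")) {
            System.out.printf("key=%s: 4 partitions -> %d, 6 partitions -> %d%n",
                    key, partitionFor(key, 4), partitionFor(key, 6));
        }
    }
}
```

With this stand-in hash, key "a" stays on partition 1, while "d" moves from partition 0 to 4 and "f" moves from 2 to 0 once the topic grows from 4 to 6 partitions. Because the partition count is part of the computation, some keys land on a different partition after the change. Records already written are not moved, so per-key ordering across the change is no longer guaranteed.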

How does ZooKeeper retrieve the consumer offsets from the __consumer_offsets topic?

亡梦爱人 submitted on 2020-07-09 12:32:12
Question: This is a follow-up question to "Where does zookeeper store Kafka cluster and related information?", based on the answer provided by Armando Ballaci. It is now clear that consumer offsets are stored in the Kafka cluster in a special topic called __consumer_offsets. That's fine; I am just wondering how the retrieval of these offsets works. Topics are not like an RDBMS, which we can query for arbitrary data based on a certain predicate. For example, if the data is stored in an RDBMS, probably a query

Where does ZooKeeper store Kafka cluster and related information?

回眸只為那壹抹淺笑 submitted on 2020-07-09 11:51:55
Question: By cluster info, I am referring to information such as subscribed consumers/consumer groups, read and committed offsets, leaders and followers of a partition, topics on the server, etc. Does ZooKeeper keep this info in its own database (though I have never heard of ZooKeeper having a database of its own), or does it store this info in the Kafka cluster on some topics? EDIT: and a follow-up question: How does ZooKeeper retrieve the consumer offsets from the __consumer_offsets topic? Answer 1: The ZooKeeper Data

Can a Kafka Streams output topic be on a separate cluster?

北城余情 submitted on 2019-12-11 01:30:06
Question: I have a centralized topic to which all logs are pushed, but I would like to filter some of those records out to a separate topic, and a separate cluster if possible. Thanks. Answer 1: Kafka Streams does not allow creating a stream whose source and output topics are on different Kafka clusters. So the following code will not work for you: streamsBuilder.stream(sourceTopicName).filter(..).to(outputTopicName); in this case it expects outputTopicName to be on the same cluster as sourceTopicName. As a

Why is Kafka not creating a topic? bootstrap-server is not a recognized option

萝らか妹 submitted on 2019-11-30 08:45:16
I am new to Kafka and am trying to create a new topic on my local machine. I am following this link. Here are the steps I followed. Start ZooKeeper: bin/zookeeper-server-start.sh config/zookeeper.properties. Start the Kafka server: bin/kafka-server-start.sh config/server.properties. Create a topic: bin/kafka-topics.sh --create --bootstrap-server localhost:9092 --replication-factor 1 --partitions 1 --topic test. But when creating the topic, I get the following error: Exception in thread "main" joptsimple.UnrecognizedOptionException: bootstrap-server is not a recognized option at joptsimple

Why is Kafka not creating a topic? bootstrap-server is not a recognized option

雨燕双飞 submitted on 2019-11-29 11:34:02
Question: I am new to Kafka and am trying to create a new topic on my local machine. I am following this link. Here are the steps I followed. Start ZooKeeper: bin/zookeeper-server-start.sh config/zookeeper.properties. Start the Kafka server: bin/kafka-server-start.sh config/server.properties. Create a topic: bin/kafka-topics.sh --create --bootstrap-server localhost:9092 --replication-factor 1 --partitions 1 --topic test. But when creating the topic, I get the following error: Exception in thread "main"