Question
As far as I understand from the Kafka Streams documentation, it is not possible to stream data from only one partition of a given topic; one always has to read the whole topic.
Is that correct?
If so, are there any plans to add such an option to the API in the future?
Answer 1:
No, you can't do that. The internal consumer subscribes to the topic and joins a consumer group, which is identified by the application-id, so partitions are assigned automatically. By the way, why do you want to do that? Without rebalancing you lose the scalability that Kafka Streams provides: thanks to partition rebalancing, you can scale the whole process just by adding or removing instances of your streaming application.
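To make the rebalancing point concrete, here is a minimal sketch of how partitions might be spread across application instances. It is a simplified round-robin model, not Kafka's actual assignor; `AssignmentSketch`, `assign`, and the instance names are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class AssignmentSketch {
    // Hypothetical helper: spreads partitions 0..numPartitions-1 over the
    // given instances in round-robin order. Kafka's real assignors are more
    // involved; this only models the idea of automatic assignment.
    static Map<String, List<Integer>> assign(List<String> instances, int numPartitions) {
        Map<String, List<Integer>> result = new TreeMap<>();
        for (String instance : instances) {
            result.put(instance, new ArrayList<>());
        }
        for (int p = 0; p < numPartitions; p++) {
            String owner = instances.get(p % instances.size());
            result.get(owner).add(p);
        }
        return result;
    }

    public static void main(String[] args) {
        // Two instances share six partitions.
        System.out.println(assign(List.of("app-1", "app-2"), 6));
        // A third instance joins: the same six partitions are redistributed.
        System.out.println(assign(List.of("app-1", "app-2", "app-3"), 6));
    }
}
```

With two instances each one owns three partitions; once a third joins, each owns two. Pinning an instance to a single partition would bypass this redistribution.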
Answer 2:
You can do something close to what you need using PartitionGrouper. A partition grouper decides which topic partitions each stream task is created for.
For an example, refer to the DefaultPartitionGrouper implementation. You would need to customize it, though.
Therefore, as @ppatierno suggested, please look into your use case and then design the topology so that you do not have to deviate from standard practice.
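As a rough illustration of what a partition grouper does, the sketch below models the default behaviour as I understand it: one task per partition number within each topic group. Plain strings stand in for Kafka's `TaskId` and `TopicPartition` types, and `TaskGroupingSketch`, `group`, and the topic names are hypothetical, so treat it as a model of the idea rather than the library's code.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class TaskGroupingSketch {
    // Simplified model: for each topic group, create one task per partition
    // number and give it the partition with that number from every topic in
    // the group that has such a partition.
    static Map<String, Set<String>> group(Map<Integer, Set<String>> topicGroups,
                                          Map<String, Integer> partitionCounts) {
        Map<String, Set<String>> tasks = new TreeMap<>();
        for (Map.Entry<Integer, Set<String>> entry : topicGroups.entrySet()) {
            // The largest topic in the group determines the number of tasks.
            int maxPartitions = 0;
            for (String topic : entry.getValue()) {
                maxPartitions = Math.max(maxPartitions, partitionCounts.get(topic));
            }
            for (int p = 0; p < maxPartitions; p++) {
                Set<String> assigned = new TreeSet<>();
                for (String topic : entry.getValue()) {
                    if (p < partitionCounts.get(topic)) {
                        assigned.add(topic + "-" + p);
                    }
                }
                // Task name follows a "<topicGroupId>_<partition>" pattern.
                tasks.put(entry.getKey() + "_" + p, assigned);
            }
        }
        return tasks;
    }

    public static void main(String[] args) {
        // Hypothetical topology: one topic group (id 0) reading two topics.
        Map<Integer, Set<String>> topicGroups = Map.of(0, Set.of("orders", "payments"));
        Map<String, Integer> partitionCounts = Map.of("orders", 3, "payments", 2);
        System.out.println(group(topicGroups, partitionCounts));
    }
}
```

A customized grouper would change this mapping, for example by emitting a task only for the one partition number you care about. That is exactly the deviation from standard practice the answer warns about.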
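Answer 3:
You can do this by specifying the topic, partition number, and offset correctly. Note that this uses Spark Streaming's Kafka integration rather than Kafka Streams:
import org.apache.kafka.common.TopicPartition
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
// Starting offset for the chosen partition
val offsets = Map(new TopicPartition(topic, partition) -> 2L)
val stream = KafkaUtils.createDirectStream[String, String](
  ssc,
  PreferConsistent,
  Subscribe[String, String](topics, kafkaParams, offsets))
where partition is the partition number and 2L is the starting offset for that partition.
Be aware that Subscribe still consumes every partition of the listed topics; the offsets map only sets the starting position for the partitions it names. To read only specific partitions, ConsumerStrategies.Assign takes a collection of TopicPartition instead of topic names.
Refer to streaming_from_specific_partiton for more details.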
Answer 4:
You cannot specify a partition for the consumer here, because automatic partition assignment is how Kafka scales; that is simply how a distributed system works. Instead, you can segment the data and allocate each segment to its own topic, with each topic having only one partition.
Since topics are registered in ZooKeeper, you might run into issues if you try to add too many of them, e.g. if you have a million users and decide to create a topic per user.
Source: https://stackoverflow.com/questions/44657521/streaming-from-particular-partition-within-a-topic-kafka-streams