Question
I have multiple microservices fronted with an API, and I'd like to use the same topic for events, with each domain event on a separate partition. I was able to configure the Spring Kafka binder to send to different partitions by setting
spring.cloud.stream.bindings.<channel>.producer.partition-key-extractor-name=
and implementing PartitionKeyExtractorStrategy.
My question is: can I configure the KStream binder to use a specific partition for @Input and @Output? My understanding so far is that the property is
spring.cloud.stream.kafka.streams.bindings.<channel>.producer.configuration.partitioner.class=
but it never gets configured. If there is another way, or if I am making a mistake, please advise.
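For reference, the working classic-binder setup described above amounts to something like the following sketch (the bean name, header name, and partition count are illustrative assumptions, not part of the original setup):

import org.springframework.cloud.stream.binder.PartitionKeyExtractorStrategy;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.messaging.Message;

@Configuration
public class PartitioningConfig {

    // Referenced by bean name from
    // spring.cloud.stream.bindings.<channel>.producer.partition-key-extractor-name
    @Bean
    public PartitionKeyExtractorStrategy eventTypeKeyExtractor() {
        // Assumption: the domain event type travels in a message header
        // named "eventType" and becomes the partition key.
        return (Message<?> message) -> message.getHeaders().get("eventType");
    }
}

with the binding configured as:

spring.cloud.stream.bindings.<channel>.producer.partition-key-extractor-name=eventTypeKeyExtractor
spring.cloud.stream.bindings.<channel>.producer.partition-count=4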
Answer 1:
Are you deterministically sending the records to a certain partition? In other words, do you know the actual partition for each key? If you only provide the PartitionKeyExtractorStrategy, the binder will pick a partition for that record arbitrarily. If you want to make it deterministic, you can provide a partitionSelectorClass property on the producer side (implement the interface PartitionSelectorStrategy). This interface lets you select a partition based on the key. Let's say you want to send all the records with key UUID-1 to partition 1, and you coded that through your PartitionSelectorStrategy implementation. Your Kafka Streams processor then knows that records with key UUID-1 are coming from partition 1.
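A minimal sketch of such a selector (the hash-based mapping is an assumption for illustration; any deterministic key-to-partition function will do):

import org.springframework.cloud.stream.binder.PartitionSelectorStrategy;

public class UuidPartitionSelector implements PartitionSelectorStrategy {

    @Override
    public int selectPartition(Object key, int partitionCount) {
        // Deterministic: the same UUID key always maps to the same partition,
        // so the downstream processor knows where each key's records live.
        return Math.floorMod(key.hashCode(), partitionCount);
    }
}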
With these assumptions, you can do the following in your Kafka Streams processor. This is basically a variant of the answer provided for one of your other questions.
@StreamListener("requesti")
@SendTo("responseo")
public KStream<UUID, Account> process(KStream<UUID, Account> events) {
    return events.transform(() -> new Transformer<UUID, Account, KeyValue<UUID, Account>>() {

        ProcessorContext context;

        @Override
        public void init(ProcessorContext context) {
            this.context = context;
        }

        @Override
        public KeyValue<UUID, Account> transform(UUID key, Account value) {
            // Only keep records that arrived on partition 1; returning null
            // drops the record from the downstream flow.
            if (this.context.partition() == 1) {
                // your processing logic
                return KeyValue.pair(key, value);
            }
            return null;
        }

        @Override
        public void close() {
        }
    });
}
With the above code, you can filter out all the irrelevant partitions in the transform method. There is still the problem of sending data on the outbound to a particular partition. If you go with the above code as is, the binder will send the data to arbitrary partitions (this might be a good feature to add to the binder, though). However, if you want the outbound records to land on deterministic partitions, you can use Kafka Streams directly in this case. See below.
@StreamListener("requesti")
public void process(KStream<UUID, Account> events) {
    final KStream<UUID, Account> transformed = events.transform(() -> new Transformer<UUID, Account, KeyValue<UUID, Account>>() {

        ProcessorContext context;

        @Override
        public void init(ProcessorContext context) {
            this.context = context;
        }

        @Override
        public KeyValue<UUID, Account> transform(UUID key, Account value) {
            if (this.context.partition() == 1) {
                // your processing logic
                return KeyValue.pair(key, value);
            }
            return null;
        }

        @Override
        public void close() {
        }
    });

    transformed.to("outputTopic", Produced.with(new JsonSerde<>(), new JsonSerde<>(), new CustomStreamPartitioner()));
}

class CustomStreamPartitioner implements StreamPartitioner<UUID, Account> {

    @Override
    public Integer partition(String topic, UUID key, Account value, int numPartitions) {
        return 1; // change to the right partition based on the key
    }
}
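For the round trip to stay deterministic, this StreamPartitioner should apply the same key-to-partition mapping as the producer-side PartitionSelectorStrategy. A keyed variant mirroring the selector sketch above (again, the hash mapping is an assumption):

class HashBasedStreamPartitioner implements StreamPartitioner<UUID, Account> {

    @Override
    public Integer partition(String topic, UUID key, Account value, int numPartitions) {
        // Same mapping as the producer side, so outbound records land on the
        // partition that downstream consumers expect for that key.
        return Math.floorMod(key.hashCode(), numPartitions);
    }
}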
Source: https://stackoverflow.com/questions/54888311/spring-cloud-stream-topic-partitions-kstream-read-write