Apache Kafka Streams Materializing KTables to a topic seems slow

半城伤御伤魂 posted on 2019-12-21 07:39:14

Question


I'm using Kafka Streams and I'm trying to materialize a KTable into a topic.

It works, but the output topic only seems to be updated every 30 seconds or so.

How and when does Kafka Streams decide to materialize the current state of a KTable into a topic?

Is there any way to shorten this interval and make it more "real-time"?

Here is the actual code I'm using:

// Stream of random ints: (1,1) -> (6,6) -> (3,3)
// one record every 500ms
KStream<Integer, Integer> kStream = builder.stream(Serdes.Integer(), Serdes.Integer(), RandomNumberProducer.TOPIC);

// grouping by key
KGroupedStream<Integer, Integer> byKey = kStream.groupByKey(Serdes.Integer(), Serdes.Integer());

// same behaviour with or without the TimeWindow
KTable<Windowed<Integer>, Long> count = byKey.count(TimeWindows.of(1000L),"total");

// same behaviour with only count.to(Serdes.Integer(), Serdes.Long(), RandomCountConsumer.TOPIC);
count.toStream().map((k,v) -> new KeyValue<>(k.key(), v)).to(Serdes.Integer(), Serdes.Long(), RandomCountConsumer.TOPIC);

Answer 1:


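This is controlled by commit.interval.ms, which defaults to 30 seconds (30000 ms). The developer guide at http://docs.confluent.io/current/streams/developer-guide.html describes the behaviour:

"The semantics of caching is that data is flushed to the state store and forwarded to the next downstream processor node whenever the earliest of commit.interval.ms or cache.max.bytes.buffering (cache pressure) hits."

So lowering commit.interval.ms, shrinking the cache, or both makes updates reach the output topic sooner. More background on the caching design is in KIP-63: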

https://cwiki.apache.org/confluence/display/KAFKA/KIP-63%3A+Unify+store+and+downstream+caching+in+streams
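
As a rough illustration of the two settings named above, here is a minimal sketch of a configuration that trades throughput for lower latency. It only builds the Properties object, so it runs without a broker. The application id, the bootstrap address, the class and method names, and the value of 100 ms are assumptions made for this example, not values from the question. In a real application the same keys are available as StreamsConfig.COMMIT_INTERVAL_MS_CONFIG and StreamsConfig.CACHE_MAX_BYTES_BUFFERING_CONFIG, and the properties are passed to the KafkaStreams constructor.

```java
import java.util.Properties;

public class LowLatencyStreamsConfig {

    // Builds a Streams configuration that flushes KTable updates more often.
    static Properties lowLatencyProps() {
        Properties props = new Properties();

        // Hypothetical values for illustration only.
        props.put("application.id", "random-count-app");
        props.put("bootstrap.servers", "localhost:9092");

        // Commit (and therefore flush the cache) every 100 ms instead of
        // the 30000 ms default. The value 100 is an arbitrary example.
        props.put("commit.interval.ms", "100");

        // A cache size of 0 bytes disables record caching, so each update
        // is forwarded downstream without waiting for a commit.
        props.put("cache.max.bytes.buffering", "0");

        return props;
    }

    public static void main(String[] args) {
        Properties props = lowLatencyProps();
        System.out.println("commit.interval.ms = "
                + props.getProperty("commit.interval.ms"));
        System.out.println("cache.max.bytes.buffering = "
                + props.getProperty("cache.max.bytes.buffering"));
    }
}
```

Either setting alone should already shorten the delay. With the cache disabled, every single count update is written to the output topic, so expect considerably more downstream traffic than with the defaults.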



Source: https://stackoverflow.com/questions/44711499/apache-kafka-streams-materializing-ktables-to-a-topic-seems-slow
