apache-kafka-streams

KStream batch process windows

I want to batch messages with the KStream interface. I have a stream of keys/values, and I tried to collect them in a tumbling window and then process the complete window at once:

    builder.stream(longSerde, updateEventSerde, CONSUME_TOPIC)
        .aggregateByKey(
            HashMap::new,
            (aggKey, value, aggregate) -> {
                aggregate.put(value.getUuid(), value);
                return aggregate;
            },
            TimeWindows.of("intentWindow", 100),
            longSerde, mapSerde)
        .foreach((wk, values) -> {

The thing is, foreach gets called on each update to the KTable. I would like to process the whole window once it is complete, as in: collect data from …
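In current Kafka Streams releases, the usual way to get exactly one result per window is the DSL's suppress() operator, which holds back intermediate updates until the window closes. A minimal sketch of that approach, assuming the same hypothetical serdes (longSerde, updateEventSerde, mapSerde) and topic as above:

    import java.time.Duration;
    import java.util.HashMap;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.kstream.Consumed;
    import org.apache.kafka.streams.kstream.Grouped;
    import org.apache.kafka.streams.kstream.Materialized;
    import org.apache.kafka.streams.kstream.Suppressed;
    import org.apache.kafka.streams.kstream.TimeWindows;

    StreamsBuilder builder = new StreamsBuilder();
    builder.stream(CONSUME_TOPIC, Consumed.with(longSerde, updateEventSerde))
        .groupByKey(Grouped.with(longSerde, updateEventSerde))
        // 100 ms tumbling windows; zero grace so a window closes as soon as
        // stream time passes its end
        .windowedBy(TimeWindows.of(Duration.ofMillis(100)).grace(Duration.ZERO))
        .aggregate(
            HashMap::new,
            (aggKey, value, aggregate) -> {
                aggregate.put(value.getUuid(), value);
                return aggregate;
            },
            Materialized.with(longSerde, mapSerde))
        // emit each window once, only after it has closed
        .suppress(Suppressed.untilWindowCloses(Suppressed.BufferConfig.unbounded()))
        .toStream()
        .foreach((windowedKey, windowContents) -> {
            // windowContents now holds the complete window's aggregate
        });

Note that suppress() only emits when stream time advances past the window end plus grace, i.e. after newer records arrive on the partition.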

KafkaStreams: Getting Window Final Results

Question: Is it possible to get the final window result in Kafka Streams by suppressing the intermediate results? I cannot achieve this goal. What is wrong with my code?

    val builder = StreamsBuilder()
    builder.stream<String, Double>(inputTopic)
        .groupByKey()
        .windowedBy(TimeWindows.of(Duration.ofSeconds(15)))
        .count()
        .suppress(Suppressed.untilWindowCloses(unbounded())) // not working
        .toStream()
        .print(Printed.toSysOut())

It leads to this error: Failed to flush state store KSTREAM-AGGREGATE-STATE-STORE…
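One frequent cause of this flush failure is that the suppression buffer falls back to the default serdes, which do not match the actual key/value types. A sketch (in Java, for consistency with the other snippets here) that pins the serdes explicitly and gives the window a finite grace period so it can actually close; the types assume String keys and Double values as above:

    import java.time.Duration;
    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.kstream.Consumed;
    import org.apache.kafka.streams.kstream.Grouped;
    import org.apache.kafka.streams.kstream.Materialized;
    import org.apache.kafka.streams.kstream.Printed;
    import org.apache.kafka.streams.kstream.Suppressed;
    import org.apache.kafka.streams.kstream.TimeWindows;

    StreamsBuilder builder = new StreamsBuilder();
    builder.stream(inputTopic, Consumed.with(Serdes.String(), Serdes.Double()))
        // explicit serdes keep the window store and the suppression buffer
        // from falling back to possibly mismatched defaults
        .groupByKey(Grouped.with(Serdes.String(), Serdes.Double()))
        .windowedBy(TimeWindows.of(Duration.ofSeconds(15)).grace(Duration.ZERO))
        .count(Materialized.with(Serdes.String(), Serdes.Long()))
        .suppress(Suppressed.untilWindowCloses(Suppressed.BufferConfig.unbounded()))
        .toStream()
        .print(Printed.toSysOut());

Also remember that untilWindowCloses emits nothing until stream time moves past the window end plus grace, so a quiet topic can look like "not working".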

How to add a custom StateStore to the Kafka Streams DSL processor?

For one of my Kafka Streams apps, I need to use the features of both the DSL and the Processor API. My streaming app's flow is:

    source -> selectKey -> filter -> aggregate (on a window) -> sink

After aggregation I need to send a SINGLE aggregated message to the sink. So I define my topology as below:

    KStreamBuilder builder = new KStreamBuilder();
    KStream<String, String> source = builder.stream(source_stream);

    source.selectKey(new MyKeyValueMapper())
          .filterNot((k, v) -> k.equals("UnknownGroup"))
          .process(() -> new MyProcessor());

I define a custom StateStore and register it with my processor as below: public …
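With the current API, the usual pattern is to register the store on the builder and pass its name to process(), which connects it to the processor. A sketch using the modern StreamsBuilder, with a hypothetical store name "my-store" and String serdes:

    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.kstream.KStream;
    import org.apache.kafka.streams.state.KeyValueStore;
    import org.apache.kafka.streams.state.StoreBuilder;
    import org.apache.kafka.streams.state.Stores;

    StreamsBuilder builder = new StreamsBuilder();

    // build the custom store and register it with the topology
    StoreBuilder<KeyValueStore<String, String>> storeBuilder =
        Stores.keyValueStoreBuilder(
            Stores.persistentKeyValueStore("my-store"),
            Serdes.String(), Serdes.String());
    builder.addStateStore(storeBuilder);

    KStream<String, String> source = builder.stream(source_stream);
    source.selectKey(new MyKeyValueMapper())
          .filterNot((k, v) -> k.equals("UnknownGroup"))
          // naming the store here attaches it to MyProcessor, which can
          // then retrieve it via context.getStateStore("my-store")
          .process(() -> new MyProcessor(), "my-store");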

Handling exceptions in Kafka streams

I have gone through multiple posts, but most of them relate to handling bad messages, not to exception handling while processing them. I want to know how to handle a message that has been received by the stream application when an exception occurs while processing it. The exception could have multiple causes, such as a network failure or a RuntimeException. Could someone suggest the right way to do this? Should I use setUncaughtExceptionHandler, or is there a better way? How do I handle retries? Thanks in advance!

It depends on what you want to do with exceptions on …
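One common split, sketched below: catch expected, retriable exceptions (network failures, etc.) inside your own processing code, and keep the uncaught-exception handler only for logging and cleanup once a stream thread is already dying. The retry loop and callExternalService() are hypothetical, not Kafka Streams API:

    // inside the topology: retry per record before giving up
    stream.mapValues(value -> {
        int attempts = 0;
        while (true) {
            try {
                return callExternalService(value);  // hypothetical, may throw
            } catch (RuntimeException e) {
                if (++attempts >= 3) {
                    throw e;  // exhausted retries; the stream thread will die
                }
            }
        }
    });

    // last resort: by now the thread is dead, so only log and clean up
    streams.setUncaughtExceptionHandler((thread, throwable) ->
        logger.error("Stream thread " + thread.getName() + " died", throwable));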

Kafka Streams error - Offset commit failed on partition, request timed out

Question: We use Kafka Streams for consuming, processing, and producing messages, and in our PROD environment we hit errors on multiple topics:

    ERROR org.apache.kafka.clients.consumer.internals.ConsumerCoordinator -
    [Consumer clientId=app-xxx-StreamThread-3-consumer, groupId=app]
    Offset commit failed on partition xxx-1 at offset 13920: The request timed out.

These errors occur rarely for topics with a small load, but for topics with a high load (and spikes) they occur dozens of times a day per topic. Topics …
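A common first mitigation is to give the embedded consumer more time for its requests and to commit less often, which reduces commit pressure during spikes. A sketch via the Streams consumer prefix; the values are illustrative, not a verified fix for this cluster:

    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.streams.StreamsConfig;

    Properties props = new Properties();
    props.put(StreamsConfig.APPLICATION_ID_CONFIG, "app");
    props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092");
    // allow offset-commit (and other) requests more time under load spikes
    props.put(StreamsConfig.consumerPrefix(ConsumerConfig.REQUEST_TIMEOUT_MS_CONFIG), 60000);
    // commit less frequently so fewer commit requests compete during spikes
    props.put(StreamsConfig.COMMIT_INTERVAL_MS_CONFIG, 10000);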

Correct way to restart or shutdown the stream using UncaughtExceptionHandler

Question: I have a stream app with the driver code below for real-time message transformation.

    String topicName = ...
    KStreamBuilder builder = new KStreamBuilder();
    KStream<String, String> source = builder.stream(topicName);
    source.transform(() -> new MyTransformer()).to(...);

    KafkaStreams streams = new KafkaStreams(builder, appConfig);
    streams.setUncaughtExceptionHandler(new Thread.UncaughtExceptionHandler() {
        public void uncaughtException(Thread t, Throwable e) {
            logger.error("UncaughtExceptionHandler …
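Worth noting: when this handler fires, the stream thread has already died, so the handler should only log and trigger a shutdown, and close() must not block inside the dying thread. A sketch of that pattern; on Kafka Streams 2.8+ a StreamsUncaughtExceptionHandler returning REPLACE_THREAD or SHUTDOWN_CLIENT is the cleaner option:

    import java.time.Duration;

    streams.setUncaughtExceptionHandler((t, e) -> {
        logger.error("Stream thread " + t.getName() + " failed", e);
        // zero timeout: never wait for shutdown from inside the dying thread
        streams.close(Duration.ZERO);
    });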

How to Handle Different Timezone in Kafka Streams?

I was evaluating Kafka Streams to see whether it fits my use case: I need to aggregate sensor data every 15 minutes, hourly, and daily, and its windowing feature looked useful. I can create windows by applying windowedBy() on a KGroupedStream, but the problem is that windows are created in UTC, and I want my data to be grouped by its originating timezone, not by UTC, as that skews the aggregation. Can anyone help me with this?

You can "shift" the timestamps using a custom TimestampExtractor -- before you write the result back into the output …
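A minimal sketch of such a shifting extractor; the class name and zone are illustrative, and the symmetric shift back would be applied before writing results to the output topic:

    import java.time.Instant;
    import java.time.ZoneId;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.streams.processor.TimestampExtractor;

    public class ZoneShiftingExtractor implements TimestampExtractor {
        // zone the sensor data originates from (illustrative)
        private final ZoneId zone = ZoneId.of("Asia/Kolkata");

        @Override
        public long extract(ConsumerRecord<Object, Object> record, long partitionTime) {
            long utcMillis = record.timestamp();
            // add the zone's UTC offset so 15-minute, hourly, and daily window
            // boundaries line up with local wall-clock time instead of UTC
            long offsetMillis = zone.getRules()
                .getOffset(Instant.ofEpochMilli(utcMillis))
                .getTotalSeconds() * 1000L;
            return utcMillis + offsetMillis;
        }
    }

The extractor is registered through StreamsConfig's default.timestamp.extractor setting (DEFAULT_TIMESTAMP_EXTRACTOR_CLASS_CONFIG).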
