apache-kafka-streams

Adding 'for' loop with delay in kafka streams

Submitted by 孤者浪人 on 2019-12-11 16:54:30
Question: Below is the code written to get the output as follows. My input will be in the form of JSON arrays; to separate the JSON arrays into JSON objects I wrote the following code:

KStreamBuilder builder = new KStreamBuilder();
KStream<String, String> textlines = builder.stream("INTOPIC");
KStream<String, String> mstream = textlines
    .mapValues(value -> value.replace("[", ""))
    .mapValues(value -> value.replace("]", ""))
    .mapValues(value -> value.replaceAll("\\},\\{", "\\}\\},\\{\\{"))
    .flatMapValues
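A minimal sketch of the same idea, assuming the newer StreamsBuilder API, string values shaped like a JSON array of objects, and an output topic named OUTTOPIC (the output topic name is not from the question):

import java.util.Arrays;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.KStream;

public class SplitJsonArraySketch {
    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> textlines = builder.stream("INTOPIC");

        // Strip the surrounding brackets, split on the "},{" boundaries between objects,
        // and emit one record per JSON object, restoring the braces lost by the split.
        KStream<String, String> objects = textlines
            .mapValues(value -> value.substring(1, value.length() - 1))
            .flatMapValues(value -> Arrays.asList(value.split("\\},\\{")))
            .mapValues(value -> (value.startsWith("{") ? "" : "{") + value
                              + (value.endsWith("}") ? "" : "}"));

        objects.to("OUTTOPIC");
    }
}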

Partition Strategy in Kafka Stream

Submitted by こ雲淡風輕ζ on 2019-12-11 15:22:47
Question: Which partition strategy does Kafka Streams use? Can we change the partition strategy in Kafka Streams the way we can in a normal Kafka consumer?

streamsConfiguration.put(ConsumerConfig.PARTITION_ASSIGNMENT_STRATEGY_CONFIG, Collections.singletonList(StickyAssignor.class));

The above makes no difference and StreamsPartitionAssignor is always used.

Answer 1: No. You cannot set a partition assignor. Kafka Streams has very specific requirements for how partition assignment works, and if it is not done correctly, incorrect result
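While the consumer-side assignor cannot be overridden, the partitioning of output records can be customized with a StreamPartitioner. A hedged sketch, assuming Kafka 2.x and that influencing output partitioning is what is actually needed (topic names are made up):

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Produced;
import org.apache.kafka.streams.processor.StreamPartitioner;

public class CustomOutputPartitioningSketch {
    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> stream = builder.stream("input-topic");

        // Route records with the same first key character to the same partition.
        StreamPartitioner<String, String> partitioner =
            (topic, key, value, numPartitions) ->
                key == null ? 0 : Math.abs(key.charAt(0)) % numPartitions;

        stream.to("output-topic",
            Produced.with(Serdes.String(), Serdes.String()).withStreamPartitioner(partitioner));
    }
}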

Why would GlobalKTable with in-memory key-value store not be filled out at restart?

Submitted by 左心房为你撑大大i on 2019-12-11 14:15:36
Question: I am trying to figure out how GlobalKTable works and noticed that my in-memory key-value store is not filled after a restart. However, the documentation suggests that it is filled on restart, since the whole data set is duplicated on the clients. When I debug my application I see that there is a file at /tmp/kafka-streams/category-client-1/global/.checkpoint, and it includes an offset for my topic. This might be necessary for stores which persist their data and improve
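For context, a minimal sketch of declaring a GlobalKTable backed by a named in-memory store, the setup described in the question (topic and store names here are illustrative, not taken from the question):

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.GlobalKTable;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.state.Stores;

public class GlobalTableSketch {
    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();

        // GlobalKTable materialized into a named in-memory key-value store.
        GlobalKTable<String, String> categories = builder.globalTable(
            "category-topic",
            Consumed.with(Serdes.String(), Serdes.String()),
            Materialized.<String, String>as(Stores.inMemoryKeyValueStore("category-store"))
                .withKeySerde(Serdes.String())
                .withValueSerde(Serdes.String()));
    }
}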

Kafka KStream application - temp file cleanup

Submitted by 萝らか妹 on 2019-12-11 13:54:26
Question: It seems that my KStream-based application has been piling up many GBs of files (.sst, Log.old.<stamp>, etc.). Will these get cleaned up on their own, or is this something I need to keep an eye on? Is there some parameter to set to cull them?

Answer 1: About these local/temp files: some of these files are application state, and those should account for the majority of the space consumed. Your application may be "piling up" many GBs of files simply because your application is actually managing a lot of state. These
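A hedged sketch of the knobs that usually matter here: state.dir controls where these files live, state.cleanup.delay.ms controls how quickly state of no-longer-assigned tasks is removed, and KafkaStreams#cleanUp() wipes the local state directory (only safe while the instance is not running). The topic names and property values below are placeholders:

import java.util.Properties;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.Topology;

public class StateCleanupSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-kstream-app");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        // Where RocksDB .sst files, logs and checkpoints are kept.
        props.put(StreamsConfig.STATE_DIR_CONFIG, "/var/lib/kafka-streams");
        // Delay before state of tasks no longer assigned to this instance is deleted (ms).
        props.put(StreamsConfig.STATE_CLEANUP_DELAY_MS_CONFIG, 10 * 60 * 1000);

        StreamsBuilder builder = new StreamsBuilder();
        builder.stream("input-topic").to("output-topic");
        Topology topology = builder.build();

        KafkaStreams streams = new KafkaStreams(topology, props);
        streams.cleanUp();   // removes local state for this application.id; call only while stopped
        streams.start();
    }
}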

how to configure two instances of Kafka StreamsBuilderFactoryBean in spring boot

Submitted by 狂风中的少年 on 2019-12-11 12:12:11
Question: Using spring-boot-2.1.3 and spring-kafka-2.2.4, I want to have two streams configurations (e.g. to use different application.id values, or to connect to different clusters, etc.). So I defined the first streams configuration pretty much according to the docs, then added a second one with a different name, and a second StreamsBuilderFactoryBean (also with a different name):

@Bean(name = KafkaStreamsDefaultConfiguration.DEFAULT_STREAMS_CONFIG_BEAN_NAME)
public KafkaStreamsConfiguration kStreamsConfigs() {
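The excerpt stops at the first bean. One plausible shape for the second pair of beans, sketched under the assumption of spring-kafka 2.2 (the bean names "secondStreamsConfig" and "secondStreamsBuilder" are made up for illustration):

import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.streams.StreamsConfig;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.config.KafkaStreamsConfiguration;
import org.springframework.kafka.config.StreamsBuilderFactoryBean;

@Configuration
public class SecondStreamsConfigSketch {

    @Bean("secondStreamsConfig")
    public KafkaStreamsConfiguration secondStreamsConfig() {
        Map<String, Object> props = new HashMap<>();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "second-app");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "other-cluster:9092");
        return new KafkaStreamsConfiguration(props);
    }

    // A second factory bean wired to the second configuration, alongside the default one.
    @Bean("secondStreamsBuilder")
    public StreamsBuilderFactoryBean secondStreamsBuilder() {
        return new StreamsBuilderFactoryBean(secondStreamsConfig());
    }
}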

Multiple @EnableBinding with Kafka Spring Cloud Stream

Submitted by 你。 on 2019-12-11 11:27:11
Question: I'm trying to set up a Spring Boot application listening to Kafka. I'm using the Kafka Streams binder with one simple @EnableBinding:

@EnableBinding(StreamExample.StreamProcessor.class)
public class StreamExample {
    @StreamListener(StreamProcessor.INPUT)
    @SendTo(StreamProcessor.OUTPUT)
    public KStream<String, String> process(KStream<String, String> input) {
        logger.info("Stream listening");
        return input
            .peek((key, value) -> logger.info("key = {} value = {}", key, value));
    }
    interface
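The excerpt is cut off at the inner interface. As a hedged sketch of one option, @EnableBinding accepts several binding interfaces, so a single application can listen on multiple stream bindings (the interface and binding names below are illustrative, not from the question):

import org.apache.kafka.streams.kstream.KStream;
import org.springframework.cloud.stream.annotation.EnableBinding;
import org.springframework.cloud.stream.annotation.Input;
import org.springframework.cloud.stream.annotation.Output;
import org.springframework.cloud.stream.annotation.StreamListener;
import org.springframework.messaging.handler.annotation.SendTo;

@EnableBinding({ MultiBindingSketch.FirstProcessor.class, MultiBindingSketch.SecondProcessor.class })
public class MultiBindingSketch {

    @StreamListener(FirstProcessor.INPUT)
    @SendTo(FirstProcessor.OUTPUT)
    public KStream<String, String> first(KStream<String, String> input) {
        return input;   // placeholder processing for the first binding
    }

    @StreamListener(SecondProcessor.INPUT)
    @SendTo(SecondProcessor.OUTPUT)
    public KStream<String, String> second(KStream<String, String> input) {
        return input;   // placeholder processing for the second binding
    }

    interface FirstProcessor {
        String INPUT = "first-in";
        String OUTPUT = "first-out";

        @Input(INPUT)
        KStream<?, ?> input();

        @Output(OUTPUT)
        KStream<?, ?> output();
    }

    interface SecondProcessor {
        String INPUT = "second-in";
        String OUTPUT = "second-out";

        @Input(INPUT)
        KStream<?, ?> input();

        @Output(OUTPUT)
        KStream<?, ?> output();
    }
}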

Does KSQL support Kafka stream Processing Guarantees?

Submitted by 别说谁变了你拦得住时间么 on 2019-12-11 11:01:48
Question: I wonder whether KSQL supports the exactly_once processing guarantee described at https://docs.confluent.io/current/streams/concepts.html#processing-guarantees ?

Answer 1: KSQL is implemented on the Kafka Streams API, which means it provides an exactly-once delivery guarantee, linear scalability, and fault tolerance, and can run as a library without requiring a separate cluster. This is stated in Confluent's "KSQL: Streaming SQL Engine for Apache Kafka"; see the last sentence of the abstract.

Source: https://stackoverflow.com/questions/57878221/does-ksql
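Since KSQL runs on Kafka Streams, the underlying switch is the Streams processing.guarantee setting. A minimal plain Kafka Streams sketch of enabling it:

import java.util.Properties;
import org.apache.kafka.streams.StreamsConfig;

public class ExactlyOnceConfigSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "eos-app");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        // Switch from the default at_least_once to exactly_once processing.
        props.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG, StreamsConfig.EXACTLY_ONCE);
    }
}

Whether and how a KSQL server exposes this (e.g. as a ksql.streams.-prefixed pass-through property) should be checked against the KSQL documentation for your version.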

Kafka Streams program is throwing exceptions when producing

Submitted by 江枫思渺然 on 2019-12-11 09:45:38
Question: I am running a program using Kafka Streams, but it throws the exception below after running for some time. Please help me solve the issue.

2017-11-21 09:58:59,947 ERROR c.e.s.c.f.k.s.p.Sample[sample-app-0.0.1-3675e4df-5e08-466e-98fe-c4f92d24df89-StreamThread-1] task [2_0] exception caught when producing
org.apache.kafka.streams.errors.StreamsException: task [2_0] exception caught when producing
    at org.apache.kafka.streams.processor.internals.RecordCollectorImpl.checkForException
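Without the full stack trace the root cause cannot be pinned down here, but a common mitigation while debugging is to install a production exception handler so that non-fatal send failures do not kill the stream thread. A hedged sketch using the built-in ProductionExceptionHandler hook (the class name is made up for illustration):

import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.errors.ProductionExceptionHandler;

public class LogAndContinueProductionHandler implements ProductionExceptionHandler {

    @Override
    public ProductionExceptionHandlerResponse handle(ProducerRecord<byte[], byte[]> record, Exception exception) {
        // Log and keep the stream thread alive instead of failing the task.
        System.err.println("Failed to produce to " + record.topic() + ": " + exception);
        return ProductionExceptionHandlerResponse.CONTINUE;
    }

    @Override
    public void configure(Map<String, ?> configs) { }

    // Registering the handler in the Streams configuration:
    static Properties streamsProps() {
        Properties props = new Properties();
        props.put(StreamsConfig.DEFAULT_PRODUCTION_EXCEPTION_HANDLER_CLASS_CONFIG,
                  LogAndContinueProductionHandler.class);
        return props;
    }
}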

Filter Kafka Streams

Submitted by 假如想象 on 2019-12-11 09:15:16
Question: I have been checking out Kafka Streams and have been testing the code below.

Producer topic (this is the first producer topic, which sends the JSON data shown below):

KafkaProducer<String, String> producer = new KafkaProducer<>(properties);
producer.send(new ProducerRecord<String, String>(topic, jsonobject.toString()));
producer.close();

JSON produced to the topic:

{"UserID":"1","Address":"XXX","AccountNo":"234234","MemberName":"Stella","AccountType":"Savings"}

Stream topic code: (this
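The stream-side excerpt is cut off, but a minimal sketch of filtering such JSON values with KStream#filter, assuming the values are the JSON strings shown above and that a simple substring check on AccountType is acceptable (a real application would parse the JSON, e.g. with Jackson); topic names are illustrative:

import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.KStream;

public class FilterSavingsAccountsSketch {
    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> users = builder.stream("user-topic");

        // Keep only records whose JSON payload marks the account as "Savings".
        users
            .filter((key, value) -> value != null && value.contains("\"AccountType\":\"Savings\""))
            .to("savings-topic");
    }
}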

TopologyTestDriver sending incorrect message on KTable aggregations

Submitted by 邮差的信 on 2019-12-11 09:07:10
Question: I have a topology that aggregates on a KTable. This is a generic method I created to build this topology on the different topics I have:

public static <A, B, C> KTable<C, Set<B>> groupTable(KTable<A, B> table, Function<B, C> getKeyFunction,
        Serde<C> keySerde, Serde<B> valueSerde, Serde<Set<B>> aggregatedSerde) {
    return table
        .groupBy((key, value) -> KeyValue.pair(getKeyFunction.apply(value), value), Serialized.with(keySerde, valueSerde))
        .aggregate(() -> new HashSet<>(), (key, newValue, agg) -> {
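For completeness, a hedged sketch of driving a topology with the older TopologyTestDriver API (pre-2.4 style, matching the Serialized.with usage above); the trivial pass-through topology, topic names, and serdes are placeholders standing in for the real aggregation under test:

import java.util.Properties;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.kafka.common.serialization.StringSerializer;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.TopologyTestDriver;
import org.apache.kafka.streams.test.ConsumerRecordFactory;

public class GroupTableDriverSketch {
    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();
        // Placeholder topology; a real test would wire the groupTable(...) aggregation here.
        builder.stream("input-topic").to("output-topic");

        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "test-app");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "dummy:1234");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        TopologyTestDriver driver = new TopologyTestDriver(builder.build(), props);
        ConsumerRecordFactory<String, String> factory =
            new ConsumerRecordFactory<>("input-topic", new StringSerializer(), new StringSerializer());

        // Pipe one record in and read back what the topology emitted.
        driver.pipeInput(factory.create("input-topic", "key-1", "value-1"));
        ProducerRecord<String, String> out =
            driver.readOutput("output-topic", new StringDeserializer(), new StringDeserializer());
        System.out.println(out);

        driver.close();
    }
}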