apache-kafka-streams

Left joining a KStream on another KStream, but only with “latest” results

耗尽温柔 · Submitted on 2019-12-11 07:23:18
Question: I have a data stream on Kafka that I consume as a KStream. Next to it I have a metadata stream that I would like to enrich the data stream with. This is a fairly common scenario, present in several examples. What I haven't solved is the case where the metadata stream contains more than one result for the specified window. What is commonly wanted here is to join against the latest, or last, element from the metadata stream. A sales order, for example, would be materialised once, with the latest…
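
The usual way to get "only the latest" behaviour is to read the metadata topic as a KTable instead of a second KStream: a table keeps exactly one (the most recent) value per key, so a stream-table join always enriches with the latest metadata. A minimal sketch, assuming String serdes and hypothetical topic names orders, order-metadata and enriched-orders:

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Produced;

public class LatestMetadataEnricher {
    public static Topology build() {
        StreamsBuilder builder = new StreamsBuilder();

        // The fact stream: one record per sales-order event.
        KStream<String, String> orders =
            builder.stream("orders", Consumed.with(Serdes.String(), Serdes.String()));

        // Reading the metadata topic as a KTable keeps only the newest value
        // per key, so the join below always sees the latest metadata record.
        KTable<String, String> metadata =
            builder.table("order-metadata", Consumed.with(Serdes.String(), Serdes.String()));

        // Stream-table left join: each order is enriched with the current
        // metadata, or null when none has arrived yet.
        KStream<String, String> enriched = orders.leftJoin(
            metadata,
            (order, meta) -> order + "|" + (meta == null ? "no-metadata" : meta));

        enriched.to("enriched-orders", Produced.with(Serdes.String(), Serdes.String()));
        return builder.build();
    }
}
```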

Kafka Streams failed to delete the state directory - DirectoryNotEmptyException

柔情痞子 · Submitted on 2019-12-11 06:04:30
Question: I noticed the exception stream-thread [x-CleanupThread] Failed to delete the state directory in our Kafka Streams application. The application uses a windowed state store, defined as: Stores.windowStoreBuilder( Stores.persistentWindowStore( storeName, retentionPeriod, retentionWindowSize, false), Serdes.String(), Serdes.String()).withCachingEnabled(); This is not a test issue using the topology test driver; it happens in the actually deployed streams application. Every ten minutes it will try to delete the…
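
The cleanup pass is governed by the state.cleanup.delay.ms setting (default 600000 ms, matching the ten-minute cadence above), and the deletion failure is typically a benign race between the CleanupThread and a still-active task. A hedged sketch of lengthening the delay as a mitigation; the application id and broker address are assumptions:

```java
import java.util.Properties;
import org.apache.kafka.streams.StreamsConfig;

public class CleanupDelayConfig {
    public static Properties props() {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "windowed-store-app"); // hypothetical
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");  // assumption
        // The CleanupThread wakes every state.cleanup.delay.ms (default 600000 ms,
        // i.e. ten minutes) and removes state directories of tasks no longer
        // assigned locally; a longer delay makes the delete race rarer.
        props.put(StreamsConfig.STATE_CLEANUP_DELAY_MS_CONFIG, 30 * 60 * 1000L);
        return props;
    }
}
```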

Issue with Kafka Streams filtering

点点圈 · Submitted on 2019-12-11 05:21:24
Question: I'm trying to run a basic app from the following example: https://github.com/confluentinc/examples/blob/3.3.x/kafka-streams/src/main/scala/io/confluent/examples/streams/MapFunctionScalaExample.scala However, I'm getting an exception at this line: // Variant 1: using `mapValues` val uppercasedWithMapValues: KStream[Array[Byte], String] = textLines.mapValues(_.toUpperCase()) Error:(33, 25) missing parameter type for expanded function ((x$1) => x$1.toUpperCase()) textLines.mapValues(_.toUpperCase…
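
The error is the Scala compiler failing to infer the lambda's parameter type against the Java mapValues signature; the usual remedy is to spell out the function type, e.g. an explicit ValueMapper. A sketch of the same pipeline with the mapper type made explicit, shown in Java for brevity (topic names follow the Confluent example):

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Produced;
import org.apache.kafka.streams.kstream.ValueMapper;

public class MapFunctionExample {
    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();
        KStream<byte[], String> textLines =
            builder.stream("TextLinesTopic", Consumed.with(Serdes.ByteArray(), Serdes.String()));

        // Spelling out the ValueMapper removes the type-inference ambiguity;
        // in Scala the equivalent is passing an explicit
        // ValueMapper[String, String] (or annotating the lambda parameter).
        KStream<byte[], String> uppercased =
            textLines.mapValues((ValueMapper<String, String>) String::toUpperCase);

        uppercased.to("UppercasedTextLinesTopic",
            Produced.with(Serdes.ByteArray(), Serdes.String()));
    }
}
```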

How do I transform/fork a Kafka stream and send it over to a specific topic?

痴心易碎 · Submitted on 2019-12-11 04:18:59
Question: I am trying to transform the string values in my original stream "textlines" into JSONObject messages in a new stream, newStream, using the function "mapValues", and then stream whatever lands in newStream to a topic called "testoutput". But every time a message actually goes through the transformation block, I get a NullPointerException with a stack trace pointing only into the Kafka Streams libraries. I have no idea what is going on :(( P.S. When I fork/create a new Kafka stream from the original stream, does the…
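
A NullPointerException raised inside the Streams library right after a transformation frequently means the sink cannot serialize the new value type with the default serdes. A hedged sketch that keeps the value as a JSON string (so the String serde still applies) and passes explicit serdes to to(); the JSON envelope shape is an assumption:

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Produced;

public class ForkToTopic {
    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> textlines =
            builder.stream("textlines", Consumed.with(Serdes.String(), Serdes.String()));

        // Keeping the value a JSON *string* means the String serde below can
        // still serialize it; a JSONObject value would need its own Serde.
        KStream<String, String> newStream =
            textlines.mapValues(line -> "{\"message\":\"" + line + "\"}");

        // Explicit serdes on the sink avoid falling back to mismatched defaults.
        newStream.to("testoutput", Produced.with(Serdes.String(), Serdes.String()));

        // Note: to() does not consume the stream; textlines and newStream can
        // still feed further downstream operators (the "fork" in the P.S.).
    }
}
```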

How to run two or more topologies with the same APPLICATION_ID_CONFIG?

自闭症网瘾萝莉.ら · Submitted on 2019-12-11 04:08:31
Question: I want to run 2 topologies on the same instance. One topology involves a state store and the other involves a global store. How do I do this successfully? I have created 1 topic with 3 partitions, then added a state store in topology 1 and a global store in topology 2. Topology 1: public void createTopology() { Topology topology = new Topology(); topology.addSource("source", new KeyDeserializer(), new ValueDeserializer(), "topic1"); topology.addProcessor("processor1", new CustomProcessorSupplier1(),…
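
A KafkaStreams instance runs exactly one topology, and two instances must not share an application.id, since the id names the consumer group and the internal topics. One sketch of a way out: run both topologies in one JVM under distinct ids (buildStateStoreTopology and buildGlobalStoreTopology are hypothetical stand-ins for the two topologies in the question):

```java
import java.util.Properties;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.Topology;

public class TwoTopologies {
    public static void main(String[] args) {
        // Each KafkaStreams instance gets its own application.id: the id names
        // the consumer group and the internal changelog/repartition topics, so
        // two instances sharing one id would trample each other.
        KafkaStreams app1 = new KafkaStreams(buildStateStoreTopology(), configFor("state-store-app"));
        KafkaStreams app2 = new KafkaStreams(buildGlobalStoreTopology(), configFor("global-store-app"));
        app1.start();
        app2.start();
    }

    static Properties configFor(String applicationId) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, applicationId);
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumption
        return props;
    }

    // Hypothetical stand-ins for the question's two topologies.
    static Topology buildStateStoreTopology() { return new Topology().addSource("s1", "topic1"); }
    static Topology buildGlobalStoreTopology() { return new Topology().addSource("s2", "topic1"); }
}
```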

Is a Kafka Streams processor thread-safe?

故事扮演 · Submitted on 2019-12-11 03:19:19
Question: I know this question was asked before here: Kafka Streaming Concurrency? But this is still very strange to me. According to the documentation (or maybe I am missing something), each partition has a task, meaning a different instance of the processor, and each task is executed by a different thread. But when I tested it, I saw that different threads can get different instances of the processor. Therefore, if you want to keep any in-memory state (the old-fashioned way) in your processor, you must lock? Example…
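
The contract, as documented, is that ProcessorSupplier.get() is called once per task, and a given task is only ever driven by one thread at a time, so fields on the processor instance itself need no locking; only state shared across instances does. A sketch of the distinction (class names are made up):

```java
import java.util.concurrent.atomic.AtomicLong;
import org.apache.kafka.streams.processor.AbstractProcessor;
import org.apache.kafka.streams.processor.Processor;
import org.apache.kafka.streams.processor.ProcessorSupplier;

public class CountingSupplier implements ProcessorSupplier<String, String> {

    // Shared across ALL processor instances, and therefore across threads:
    // this must be thread-safe (or guarded by a lock).
    private final AtomicLong globalCount = new AtomicLong();

    @Override
    public Processor<String, String> get() {
        // Called once per task, so every task gets its own fresh instance.
        return new AbstractProcessor<String, String>() {
            // Per-instance state: a task is driven by one thread at a time,
            // so a plain field needs no locking.
            private long localCount = 0;

            @Override
            public void process(String key, String value) {
                localCount++;
                globalCount.incrementAndGet();
            }
        };
    }
}
```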

Can Kafka Streams output topic be on a separate cluster?

北城余情 · Submitted on 2019-12-11 01:30:06
Question: I have a centralized topic that all logs are pushed to, but I would like to filter some of those records out to a separate topic, and a separate cluster if possible. Thanks. Answer 1: Kafka Streams does not allow creating a stream whose source and output topics are on different Kafka clusters, so the following code will not work for you: streamsBuilder.stream(sourceTopicName).filter(..).to(outputTopicName) In this case it expects outputTopicName to be on the same cluster as sourceTopicName. As a…
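
The truncated answer presumably continues with the standard workarounds: write the filtered records to a topic on the source cluster and mirror it (MirrorMaker or Confluent Replicator), or ship them yourself with a plain KafkaProducer pointed at the second cluster. A hedged sketch of the producer route; topic names, the filter predicate and broker addresses are assumptions:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Consumed;

public class CrossClusterFilter {
    public static void main(String[] args) {
        // A plain producer aimed at the *other* cluster.
        Properties producerProps = new Properties();
        producerProps.put("bootstrap.servers", "other-cluster:9092"); // assumption
        producerProps.put("key.serializer",
            "org.apache.kafka.common.serialization.StringSerializer");
        producerProps.put("value.serializer",
            "org.apache.kafka.common.serialization.StringSerializer");
        KafkaProducer<String, String> producer = new KafkaProducer<>(producerProps);

        StreamsBuilder builder = new StreamsBuilder();
        builder.stream("central-logs", Consumed.with(Serdes.String(), Serdes.String()))
            .filter((key, value) -> value.contains("ERROR")) // example predicate
            // foreach is terminal: each surviving record is shipped by hand,
            // trading Streams' delivery guarantees for the cross-cluster write.
            .foreach((key, value) ->
                producer.send(new ProducerRecord<>("filtered-logs", key, value)));

        // ... then build and start the KafkaStreams instance as usual.
    }
}
```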

Set timestamp in output with Kafka Streams fails for transformations

為{幸葍}努か · Submitted on 2019-12-11 00:57:37
Question: Suppose we have a transformer (written in Scala): new Transformer[String, V, (String, V)]() { var context: ProcessorContext = _ override def init(context: ProcessorContext): Unit = { this.context = context } override def transform(key: String, value: V): (String, V) = { val timestamp = toTimestamp(value) context.forward(key, value, To.all().withTimestamp(timestamp)) key -> value } override def close(): Unit = () } where toTimestamp is just a function which returns a timestamp fetched from…
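
A likely culprit in this pattern: transform both calls context.forward(...) and returns key -> value, so the record is emitted twice, and the returned copy keeps the input timestamp rather than the forwarded one. A sketch of the forward-and-return-null variant, written in Java with String values and a made-up toTimestamp:

```java
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.kstream.Transformer;
import org.apache.kafka.streams.processor.ProcessorContext;
import org.apache.kafka.streams.processor.To;

public class TimestampTransformer
        implements Transformer<String, String, KeyValue<String, String>> {

    private ProcessorContext context;

    @Override
    public void init(ProcessorContext context) {
        this.context = context;
    }

    @Override
    public KeyValue<String, String> transform(String key, String value) {
        long timestamp = toTimestamp(value);
        // Forward explicitly so the record carries the extracted timestamp...
        context.forward(key, value, To.all().withTimestamp(timestamp));
        // ...and return null: a non-null return would be forwarded a second
        // time, carrying the input timestamp instead.
        return null;
    }

    @Override
    public void close() {}

    // Made-up stand-in for the question's toTimestamp.
    private long toTimestamp(String value) {
        return Long.parseLong(value.substring(0, 13)); // assumption: epoch-millis prefix
    }
}
```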

Streaming from particular partition within a topic (Kafka Streams)

徘徊边缘 · Submitted on 2019-12-10 21:54:47
Question: As far as I understand after reading the Kafka Streams documentation, it's not possible to use it to stream data from only one partition of a given topic; one always has to read the topic in full. Is that correct? If so, are there any plans to add such an option to the API in the future? Answer 1: No, you can't do that, because the internal consumer subscribes to the topic as part of a consumer group (specified through the application.id), so the partitions are assigned automatically. Btw, why do you…
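
When a single partition really is needed, the plain consumer API can do what Streams cannot: assign() pins a consumer to explicit partitions and bypasses group management entirely. A minimal sketch; the topic name, partition number and broker address are assumptions:

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class SinglePartitionReader {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumption
        props.put("key.deserializer",
            "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
            "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // assign() pins the consumer to exactly one partition; no consumer
            // group, no automatic assignment, unlike Kafka Streams.
            consumer.assign(Collections.singletonList(new TopicPartition("my-topic", 0)));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("offset=%d value=%s%n", record.offset(), record.value());
                }
            }
        }
    }
}
```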

Kafka Streams: java.nio.file.DirectoryNotEmptyException

孤者浪人 · Submitted on 2019-12-10 18:45:30
Question: We have an issue with deleting the state directory within a Kafka Streams application. We are running the application on an in-house container platform. Insight into this issue would be much appreciated. The log of the exception: 2018-09-18 09:26:09.112 INFO 1 --- [5-CleanupThread] o.a.k.s.p.internals.StateDirectory : stream-thread [ApplicationName-1ae22d38-32d3-451a-b039-372c79b2e6a5-CleanupThread] Deleting obsolete state directory 2_1 for task 2_1 as 601112ms has elapsed (cleanup delay is 600000ms).…
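
On container platforms the default state directory under /tmp is a common aggravating factor, since other tenants or volume mounts can interfere with the CleanupThread's recursive delete. A hedged sketch that points state.dir at a dedicated writable volume; the path and broker address are assumptions:

```java
import java.util.Properties;
import org.apache.kafka.streams.StreamsConfig;

public class StateDirConfig {
    public static Properties props() {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "ApplicationName");   // as in the log
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumption
        // Keep state off the shared /tmp: a dedicated, writable, per-container
        // volume keeps the CleanupThread's recursive delete from racing other
        // tenants of the directory.
        props.put(StreamsConfig.STATE_DIR_CONFIG, "/var/lib/kafka-streams");  // assumption
        return props;
    }
}
```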