apache-kafka-streams

Is it possible to access message headers with Kafka Streams?

谁都会走 submitted on 2019-12-04 00:45:46
Question: With the addition of Headers to the records (ProducerRecord & ConsumerRecord) in Kafka 0.11, is it possible to get these headers when processing a topic with Kafka Streams? When calling methods like map on a KStream, it provides arguments of the key and the value of the record, but no way that I can see to access the headers. It would be nice if we could just map over the ConsumerRecords. ex. KStreamBuilder kStreamBuilder = new KStreamBuilder(); KStream<String, String> stream = kStreamBuilder
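
One way this later became possible is through the Processor API rather than map(): the sketch below assumes Kafka Streams 2.0+ (where ProcessorContext#headers() is available) and uses an illustrative header name, "trace-id", that is not part of the original question.

```java
import org.apache.kafka.common.header.Header;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.ValueTransformerWithKey;
import org.apache.kafka.streams.processor.ProcessorContext;

public class HeaderAccessSketch {
    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> stream = builder.stream("input-topic");

        // map() never sees headers, but a value transformer can read them from its ProcessorContext.
        stream.transformValues(() -> new ValueTransformerWithKey<String, String, String>() {
            private ProcessorContext context;

            @Override
            public void init(ProcessorContext context) {
                this.context = context;
            }

            @Override
            public String transform(String key, String value) {
                // "trace-id" is an illustrative header name.
                Header header = context.headers().lastHeader("trace-id");
                String traceId = (header == null) ? "none" : new String(header.value());
                return traceId + ":" + value; // copy the header into the value for downstream operators
            }

            @Override
            public void close() { }
        }).to("output-topic");
    }
}
```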

Kafka Streams with state stores - Reprocessing of messages on app restart

感情迁移 submitted on 2019-12-03 21:41:44
We have the following topology with two transformers, and each transformer uses a persistent state store: kStreamBuilder.stream(inboundTopicName) .transform(() -> new FirstTransformer(FIRST_STATE_STORE), FIRST_STATE_STORE) .map((key, value) -> ...) .transform(() -> new SecondTransformer(SECOND_STATE_STORE), SECOND_STATE_STORE) .to(outboundTopicName); and the Kafka settings have auto.offset.reset: latest. After the app was launched, I saw that two internal compacted topics were created (as expected): appId_inbound_firstStateStore-changelog and appId_inbound_secondStateStore-changelog. Our app was down
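
For context, a minimal sketch of the configuration this setup implies (broker address and commit interval are assumptions). Whether records are re-read after a restart is governed by the consumer group's committed offsets; auto.offset.reset only applies when no committed offset exists for the group.

```java
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.Topology;

public class StreamsStartupSketch {
    // Starts the app with the settings the question mentions; the topology itself
    // (the two transformers and their state stores) is assumed to be built elsewhere.
    static KafkaStreams start(Topology topology) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "appId");              // prefixes the *-changelog topic names
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");  // assumed broker address
        props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "latest");         // only applies when the group has no committed offset
        props.put(StreamsConfig.COMMIT_INTERVAL_MS_CONFIG, 30000);            // records read since the last commit are re-read after a crash
        KafkaStreams streams = new KafkaStreams(topology, props);
        streams.start();
        return streams;
    }
}
```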

How to register a stateless processor (that seems to require a StateStore as well)?

|▌冷眼眸甩不掉的悲伤 submitted on 2019-12-03 16:15:44
I'm building a topology and want to use KStream.process() to write some intermediate values to a database. This step doesn't change the nature of the data and is completely stateless. Adding a Processor requires creating a ProcessorSupplier and passing this instance to the KStream.process() function along with the name of a state store. This is what I don't understand: how do you add a StateStore object to a topology, since it requires a StateStoreSupplier? Failing to add said StateStore gives this error when the application is started: Exception in thread "main" org.apache.kafka.streams.errors
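
For reference, the state-store names are a varargs parameter, so a stateless processor can simply omit them. A minimal sketch, assuming the pre-2.7 Processor API and a hypothetical writeToDatabase helper:

```java
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.processor.AbstractProcessor;

public class StatelessProcessorSketch {
    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();

        // No store names are passed to process(), so no StateStore has to be registered.
        builder.<String, String>stream("input-topic")
               .process(() -> new AbstractProcessor<String, String>() {
                   @Override
                   public void process(String key, String value) {
                       writeToDatabase(key, value);   // hypothetical side effect; nothing stateful here
                   }
               });
    }

    private static void writeToDatabase(String key, String value) {
        // placeholder for the actual database write
    }
}
```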

Event sourcing with Kafka streams

妖精的绣舞 submitted on 2019-12-03 16:14:01
I'm trying to implement a simple CQRS/event sourcing proof of concept on top of Kafka Streams (as described in https://www.confluent.io/blog/event-sourcing-using-apache-kafka/ ). I have 4 basic parts: a commands topic, which uses the aggregate ID as the key for sequential processing of commands per aggregate; an events topic, to which every change in aggregate state is published (again, the key is the aggregate ID) and which has a retention policy of "never delete"; a KTable to reduce aggregate state and save it to a state store: events topic stream -> group to a KTable by aggregate ID -> reduce
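
A rough sketch of the "events topic -> KTable" leg described above, using aggregate() since the event and state types usually differ; AggregateEvent, AggregateState, and the topic/store names are illustrative assumptions, not part of the original question.

```java
import org.apache.kafka.common.utils.Bytes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.state.KeyValueStore;

public class EventSourcingSketch {
    static KTable<String, AggregateState> buildAggregateTable(StreamsBuilder builder) {
        KStream<String, AggregateEvent> events = builder.stream("events");   // key = aggregate ID

        return events
            .groupByKey()
            .aggregate(
                AggregateState::empty,                                // initial, empty aggregate state
                (aggregateId, event, state) -> state.apply(event),    // fold each event onto the current state
                Materialized.<String, AggregateState, KeyValueStore<Bytes, byte[]>>as("aggregate-state-store"));
    }

    // Placeholder domain types for the sketch.
    interface AggregateEvent { }
    static class AggregateState {
        static AggregateState empty() { return new AggregateState(); }
        AggregateState apply(AggregateEvent event) { return this; /* real logic would fold the event in */ }
    }
}
```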

Streaming messages to multiple topics

断了今生、忘了曾经 submitted on 2019-12-03 14:45:29
I have a single master topic and multiple predicates, each of which has an output topic associated with it. I want to send each record to ALL topics whose predicate resolves to true. I am using Luwak to test which predicates a record satisfies (to use this library you evaluate a document against a list of predicates and it tells you which ones matched - i.e. I only call it once to get the list of satisfied predicates). I am trying to use Kafka Streams for this, but there doesn't seem to be an appropriate method on KStream (KStream#branch only routes a record to a single topic). One possible
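
One simple workaround is to fan out with a filter()+to() pair per predicate, sketched below. Note that, unlike the single Luwak call described above, this re-evaluates a predicate per output topic, so it is only a starting point; the predicate map and topic name are assumptions.

```java
import java.util.Map;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Predicate;

public class FanOutSketch {
    static void buildFanOut(StreamsBuilder builder, Map<Predicate<String, String>, String> predicateToTopic) {
        KStream<String, String> master = builder.stream("master-topic");

        // A record that satisfies several predicates passes several filters and is written to several topics.
        predicateToTopic.forEach((predicate, outputTopic) ->
            master.filter(predicate).to(outputTopic));
    }
}
```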

Kafka Streams API: KStream to KTable

南笙酒味 submitted on 2019-12-03 08:14:47
Question: I have a Kafka topic where I send location events (key=user_id, value=user_location). I am able to read and process it as a KStream: KStreamBuilder builder = new KStreamBuilder(); KStream<String, Location> locations = builder .stream("location_topic") .map((k, v) -> { // some processing here, omitted for clarity Location location = new Location(lat, lon); return new KeyValue<>(k, location); }); That works well, but I'd like to have a KTable with the last known position of each user. How
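
For context, a sketch of one common approach: re-group the processed stream and keep only the newest value per key. It assumes Kafka Streams 2.1+ (for Grouped), a locationSerde built elsewhere, and an illustrative store name.

```java
import org.apache.kafka.common.serialization.Serde;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.common.utils.Bytes;
import org.apache.kafka.streams.kstream.Grouped;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.state.KeyValueStore;

public class LastKnownLocationSketch {
    static KTable<String, Location> toTable(KStream<String, Location> locations, Serde<Location> locationSerde) {
        return locations
            .groupByKey(Grouped.with(Serdes.String(), locationSerde))   // re-group by user_id
            .reduce((previous, latest) -> latest,                       // always keep the most recent position
                    Materialized.<String, Location, KeyValueStore<Bytes, byte[]>>as("last-known-location-store"));
    }

    static class Location { }   // stands in for the question's Location type
}
```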

Ideal way to enrich a KStream with lookup data

て烟熏妆下的殇ゞ submitted on 2019-12-03 06:34:44
My stream has a column called 'category', and I have additional static metadata for each 'category' in a different store; it gets updated once every couple of days. What is the right way to do this lookup? There are two options with Kafka Streams: (1) load the static data outside of Kafka Streams and just use KStreams#map() to add the metadata, which is possible as Kafka Streams is just a library; or (2) load the metadata into a Kafka topic, load it into a KTable and do KStreams#leftJoin(), which seems more natural and leaves partitioning etc. to Kafka Streams. However, this requires us to keep the KTable loaded with
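
A sketch of the second option, with the variation of using a GlobalKTable so the metadata topic does not have to be co-partitioned with the stream. The topic names and the StreamRecord/CategoryMetadata/EnrichedRecord types are illustrative assumptions.

```java
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.GlobalKTable;
import org.apache.kafka.streams.kstream.KStream;

public class EnrichmentSketch {
    static KStream<String, EnrichedRecord> build(StreamsBuilder builder) {
        // Compacted topic that the metadata-updating job writes to every couple of days.
        GlobalKTable<String, CategoryMetadata> metadata = builder.globalTable("category-metadata");

        return builder.<String, StreamRecord>stream("records")
            .join(metadata,
                  (key, record) -> record.category,                     // derive the lookup key from the record
                  (record, meta) -> new EnrichedRecord(record, meta));  // merge record and metadata
    }

    // Placeholder types for the sketch.
    static class StreamRecord { String category; }
    static class CategoryMetadata { }
    static class EnrichedRecord {
        EnrichedRecord(StreamRecord record, CategoryMetadata meta) { }
    }
}
```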

Spring Kafka - Consume last N messages for partition(s) for any topic

别来无恙 submitted on 2019-12-03 03:47:31
I'm trying to read a requested number of Kafka messages. For non-transactional messages, we would seek from endOffset - N for M partitions, start polling, and collect messages where the current offset is less than the end offset for each partition. For idempotent/transactional messages we have to account for transaction markers/duplicate messages, meaning the offsets will not be continuous; in that case endOffset - N will not return N messages, and we would need to go back and seek for more messages until we have N messages for each partition or the beginning offset is reached. As there are multiple partitions I
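
For the simpler non-transactional case, a rough sketch with a plain KafkaConsumer (the stopping check and poll timeout are choices made for illustration; transaction markers, as noted above, would need extra handling):

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class LastNSketch {
    static Map<TopicPartition, List<ConsumerRecord<String, String>>> readLastN(
            KafkaConsumer<String, String> consumer, List<TopicPartition> partitions, long n) {

        consumer.assign(partitions);
        Map<TopicPartition, Long> endOffsets = consumer.endOffsets(partitions);
        Map<TopicPartition, Long> beginningOffsets = consumer.beginningOffsets(partitions);

        // Seek to endOffset - N per partition, but never before the beginning offset.
        for (TopicPartition tp : partitions) {
            consumer.seek(tp, Math.max(beginningOffsets.get(tp), endOffsets.get(tp) - n));
        }

        Map<TopicPartition, List<ConsumerRecord<String, String>>> collected = new HashMap<>();
        boolean done = false;
        // Poll until every partition has been read up to the end offset captured above.
        while (!done) {
            for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofMillis(500))) {
                TopicPartition tp = new TopicPartition(record.topic(), record.partition());
                if (record.offset() < endOffsets.get(tp)) {
                    collected.computeIfAbsent(tp, k -> new ArrayList<>()).add(record);
                }
            }
            done = partitions.stream().allMatch(tp -> consumer.position(tp) >= endOffsets.get(tp));
        }
        return collected;
    }
}
```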

Kafka Streams - Send on different topics depending on Streams Data

情到浓时终转凉″ submitted on 2019-12-03 00:43:37
I have a Kafka Streams application waiting for records to be published on topic user_activity. It will receive JSON data, and depending on the value associated with a key I want to push that stream into different topics. This is my Streams app code: KStream<String, String> source_user_activity = builder.stream("user_activity"); source_user_activity.flatMapValues(new ValueMapper<String, Iterable<String>>() { @Override public Iterable<String> apply(String value) { System.out.println("value: " + value); ArrayList<String> keywords = new ArrayList<String>(); try { JSONObject send = new JSONObject();
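
Since Kafka Streams 2.0 this kind of routing can also be done without branching, by passing a TopicNameExtractor to to(). The JSON field and output topic names below are illustrative guesses, not taken from the original code.

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Produced;
import org.json.JSONObject;

public class DynamicRoutingSketch {
    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> userActivity = builder.stream("user_activity");

        // Pick the destination topic per record from a field inside the JSON payload.
        userActivity.to(
            (key, value, recordContext) -> {
                String activity = new JSONObject(value).optString("activity", "unknown");
                return "clicked".equals(activity) ? "click-events" : "other-events";
            },
            Produced.with(Serdes.String(), Serdes.String()));
    }
}
```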