Question
I have a data stream on Kafka that I consume as a KStream. Alongside it I have a metadata stream that I would like to use to enrich the data stream. This is a fairly common scenario that appears in several examples.
What I haven't solved is the case where the metadata stream contains more than one record for the specified join window. What is usually wanted in this case is to join with only the latest (last) element from the metadata stream. A sales order, for example, should be materialised once, with the latest customer object, not once for each successive customer update.
Imagine the following scenario:
When element 7 (green) arrives, it is joined with both 2 and 3 from the metadata stream, even though only 3 is relevant (in my case).
I realise this could be a good match for a KStream-KTable join, where the KTable would contain only the latest record from the metadata stream. But that has the major disadvantage that it does not cope well with late and out-of-order data.
The question boils down to: how do I join a KStream with another KStream, but only with the latest event in the latter?
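To make the desired semantics concrete, here is a minimal plain-Java sketch of the lookup I have in mind. It does not use the Kafka Streams API at all. The class name `LatestMetadataJoin`, the method `latestWithin`, and the choice to index metadata by event timestamp in a `TreeMap` are assumptions made purely for illustration. For a given data event, it returns only the newest metadata record whose timestamp is not after the event and not older than the window.

```java
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

// Illustration of the wanted "join with latest only" behaviour.
// Hypothetical helper, not part of Kafka Streams.
class LatestMetadataJoin {

    /**
     * Returns the metadata value with the greatest timestamp that is
     * at most eventTs and at least eventTs - windowMs, if one exists.
     *
     * @param metadataByTs metadata records for one key, indexed by event timestamp
     * @param eventTs      event timestamp of the data record being enriched
     * @param windowMs     how far back in time a metadata record may lie
     */
    static <V> Optional<V> latestWithin(TreeMap<Long, V> metadataByTs,
                                        long eventTs,
                                        long windowMs) {
        // floorEntry gives the entry with the largest key <= eventTs, or null.
        Map.Entry<Long, V> candidate = metadataByTs.floorEntry(eventTs);
        if (candidate == null || candidate.getKey() < eventTs - windowMs) {
            return Optional.empty(); // nothing in the window: left-join miss
        }
        return Optional.ofNullable(candidate.getValue());
    }
}
```

With metadata records at (made-up) timestamps 20 and 30 and a window of 60, a data event at timestamp 70 is matched with the record at 30 only, while a late-arriving data event stamped 25 is still matched with the record at 20. That second case is what a plain KStream-KTable join would not give me.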
Source: https://stackoverflow.com/questions/47495299/left-joining-a-kstream-on-another-kstream-but-only-with-latest-results