So I was evaluating the Kafka Streams and what it can do to see if it can fit my use case as I needed to do the aggregation of sensor's data for each 15min, Hourly, Daily and found it useful due to its Windowing feature.
As I can create windows by applying windowedBy()
on KGroupedStream
but the problem is that windows are created in UTC and i want my data to be grouped by its originating timezone not by UTC Timezone as it hampers the aggregation so can any one help me on this.
You can "shift" the timestamps using a custom TimestampExtractor
-- before you write the result back into the output topic, you can use a Transformer
and "shift" the timestamps back via context.forward(key, value, To.all().withTimestamps())
.
Feature request ticket: https://issues.apache.org/jira/browse/KAFKA-7911
So to solve this issue I created custom TimestampExtractor
and used it to change the streams window creation time to record time from the payload as show below.
public class RecordTimeStampExtractor implements TimestampExtractor {
@Override
public long extract(ConsumerRecord<Object, Object> record, long previousTimestamp) {
JsonObject data = (JsonObject) new JsonParser().parse(record.value().toString());
Timestamp recordTimestamp = Timestamp.valueOf(data.get(Constant.SLOT).getAsString());
return recordTimestamp.getTime();
}
}
so now I have tested it with my local timezone since yesterday which is IST 05:30 and its working fine also kafka streams are creating windows based on records timestamp. Will test with other timezone as well and update the answer
来源:https://stackoverflow.com/questions/54395471/how-to-handle-different-timezone-in-kafka-streams