问题
I'm required to calculate median of many parameters received from a kafka stream for 15 min time window.
i couldn't find any built in function for that, but I have found a way using custom WindowFunction.
my questions are:
- is it a difficult task for flink? the data can be very large.
- if the data gets to giga bytes, will flink store everything in memory until the end of the time window? (one of the arguments of apply WindowFunction implementation is Iterable - a collection of all data which came during the time window )
thanks
来源:https://stackoverflow.com/questions/46451604/flink-calculate-median-on-stream