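1. WindowFunction
Flink provides four types of window function: ReduceFunction, AggregateFunction, FoldFunction, and ProcessWindowFunction. Based on how they compute, ReduceFunction, AggregateFunction, and FoldFunction are incremental aggregation functions, while ProcessWindowFunction is a full-window aggregation function. An incremental aggregation function computes on an intermediate state: the window keeps only the intermediate result and does not need to buffer the raw elements. A full-window function computes over all of the raw elements when the window fires, so it has to buffer them and generally performs worse.
- ReduceFunction: incremental aggregation
- AggregateFunction: incremental aggregation
- FoldFunction: deprecated; AggregateFunction is recommended instead
- ProcessWindowFunction: full-window aggregation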
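The difference between the two styles can be illustrated without Flink at all. The following is a minimal plain-Scala sketch, not Flink code: `IncrementalWindow` and `FullWindow` are hypothetical names made up for this illustration. The first keeps only a running sum as state, while the second buffers every element until it fires.

```scala
// Plain Scala illustration (no Flink dependency); all names here are hypothetical.
object WindowStyles {

  // Incremental style: only the intermediate result is kept as state.
  final class IncrementalWindow {
    private var sum: Long = 0L
    def add(value: Long): Unit = sum += value // update the state for each element
    def fire(): Long = sum                    // the result is already computed
    def bufferedElements: Int = 0             // raw elements are never stored
  }

  // Full-window style: raw elements are buffered until the window fires.
  final class FullWindow {
    private val buffer = scala.collection.mutable.ArrayBuffer.empty[Long]
    def add(value: Long): Unit = buffer += value // just remember the element
    def fire(): Long = buffer.sum                // compute over everything at once
    def bufferedElements: Int = buffer.size
  }

  def main(args: Array[String]): Unit = {
    val inc = new IncrementalWindow
    val full = new FullWindow
    Seq(3L, 5L, 7L).foreach { v => inc.add(v); full.add(v) }
    println(s"incremental: result=${inc.fire()}, buffered=${inc.bufferedElements}")
    println(s"full:        result=${full.fire()}, buffered=${full.bufferedElements}")
  }
}
```

Both produce the same sum, but the state held by the full-window version grows with the number of elements in the window.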
2. ReduceFunction
x is the result aggregated so far and y is the next element; the input and the output have the same type.
.reduce(new ReduceFunction[Sensor] {
  override def reduce(x: Sensor, y: Sensor): Sensor = {
    Sensor(x.id, x.timestamp + 1, y.temperature + 1)
  }
})
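To see how x and y evolve, the same reduce logic can be applied to a small list in plain Scala. This is only a sketch outside of Flink: the field types of `Sensor` are an assumption (the original does not show its definition), and `ReduceDemo` and the sample readings are made up for illustration.

```scala
// Plain Scala illustration (no Flink dependency).
// Assumption: Sensor's field types; the original does not show the case class.
final case class Sensor(id: String, timestamp: Long, temperature: Double)

object ReduceDemo {
  // Same logic as the reduce function above:
  // x is the result so far, y is the next element.
  def reduce(x: Sensor, y: Sensor): Sensor =
    Sensor(x.id, x.timestamp + 1, y.temperature + 1)

  def main(args: Array[String]): Unit = {
    val readings = List(
      Sensor("sensor_1", 100L, 20.0),
      Sensor("sensor_1", 200L, 25.0),
      Sensor("sensor_1", 300L, 30.0)
    )
    // The first element becomes the initial result; reduce is then
    // called once for each following element, left to right.
    val result = readings.reduceLeft((acc, next) => reduce(acc, next))
    println(result) // prints Sensor(sensor_1,102,31.0)
  }
}
```

The timestamp is taken from the running result (100 + 1 + 1), while the temperature always comes from the newest element (30.0 + 1).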
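3. AggregateFunction
For a plain aggregation, pass only an AggregateFunction to aggregate(). If the aggregated result needs further processing, use aggregate(AggregateFunction, WindowFunction) instead (to access the context or work with state, a RichWindowFunction can be used).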
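import org.apache.flink.api.common.functions.AggregateFunction
import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.api.scala.function.WindowFunction
import org.apache.flink.streaming.api.windowing.time.Time
import org.apache.flink.streaming.api.windowing.windows.TimeWindow
import org.apache.flink.util.Collector
val pvStream = dataStream.filter(_.behavior == "pv")
  .keyBy(_.itemId)
  .timeWindow(Time.hours(1), Time.minutes(5))
  //.aggregate(new CountAgg())
  .aggregate(new CountAgg(), new WindowResult())
  .keyBy(_.windowEnd)
  .process(new TopHotItems(3))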
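// IN: input type; ACC: intermediate accumulator type (may be any type); OUT: type of the result passed to the window function
class CountAgg extends AggregateFunction[UserBehavior, Long, Long] {
  // Initial value of the accumulator
  override def createAccumulator(): Long = 0L
  // Merges two accumulators into one (for example when windows are merged)
  override def merge(acc: Long, acc1: Long): Long = acc + acc1
  // Converts the accumulator into the output value
  override def getResult(acc: Long): Long = acc
  // Called for each newly arrived element
  override def add(in: UserBehavior, acc: Long): Long = acc + 1
}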
class WindowResult extends WindowFunction[Long, ItemViewCount, Long, TimeWindow] {
  override def apply(key: Long, window: TimeWindow, input: Iterable[Long], out: Collector[ItemViewCount]): Unit = {
    // input holds the single pre-aggregated count produced by CountAgg
    out.collect(ItemViewCount(key, window.getEnd, input.iterator.next()))
  }
}
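In CountAgg the accumulator and the output are both Long, which hides the roles of ACC and OUT. The sketch below uses an average instead, where the accumulator is a (sum, count) pair and the output is a Double. It is plain Scala rather than Flink code: `SimpleAggregate` is a hypothetical stand-in that only mirrors the four methods used above, and `AvgAgg` and `AggregateDemo` are likewise made up for illustration.

```scala
// Plain Scala illustration (no Flink dependency).
// SimpleAggregate is a hypothetical stand-in mirroring the four methods
// of the AggregateFunction used above; it is not Flink's own interface.
trait SimpleAggregate[IN, ACC, OUT] {
  def createAccumulator(): ACC
  def add(in: IN, acc: ACC): ACC
  def getResult(acc: ACC): OUT
  def merge(a: ACC, b: ACC): ACC
}

// Average: the accumulator is (sum, count), the output is a Double.
object AvgAgg extends SimpleAggregate[Double, (Double, Long), Double] {
  def createAccumulator(): (Double, Long) = (0.0, 0L)
  def add(in: Double, acc: (Double, Long)): (Double, Long) =
    (acc._1 + in, acc._2 + 1)
  def getResult(acc: (Double, Long)): Double =
    if (acc._2 == 0) 0.0 else acc._1 / acc._2
  def merge(a: (Double, Long), b: (Double, Long)): (Double, Long) =
    (a._1 + b._1, a._2 + b._2)
}

object AggregateDemo {
  // Feeds the elements one by one, as an incremental window would.
  def accumulate(values: Seq[Double]): (Double, Long) =
    values.foldLeft(AvgAgg.createAccumulator())((acc, v) => AvgAgg.add(v, acc))

  def main(args: Array[String]): Unit = {
    val left = accumulate(Seq(10.0, 20.0))
    val right = accumulate(Seq(30.0))
    // Merging two partial accumulators matches a single pass over all elements.
    println(AvgAgg.getResult(AvgAgg.merge(left, right)))         // prints 20.0
    println(AvgAgg.getResult(accumulate(Seq(10.0, 20.0, 30.0)))) // prints 20.0
  }
}
```

Keeping the sum and the count separately in the accumulator is what makes merge possible; two averages alone could not be combined correctly.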
4. ProcessWindowFunction
Sample code, for reference only.
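import org.apache.flink.api.common.state.{ValueState, ValueStateDescriptor}
import org.apache.flink.streaming.api.scala.function.ProcessWindowFunction
val pvStream = dataStream.filter(_.behavior == "pv")
  .keyBy(_.itemId)
  .timeWindow(Time.hours(1), Time.minutes(5))
  .process(new ProcessWindowFunction[UserBehavior, Long, Long, TimeWindow] {
    // Keyed state: scoped to the key rather than to a single window, so it persists across windows
    lazy val count: ValueState[Long] = getRuntimeContext.getState(new ValueStateDescriptor("count", classOf[Long]))
    @throws[Exception]
    override def process(key: Long, context: Context, elements: Iterable[UserBehavior], out: Collector[Long]): Unit = {
      // Called each time the window fires; elements holds all buffered elements of the window.
      // The state is incremented once per firing, so it counts firings for this key, not elements.
      val preCount = count.value()
      count.update(preCount + 1)
      out.collect(count.value())
    }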
    // Keyed state is not cleared in close(): no key is in scope there
})
Source: CSDN
Author: herokang
Link: https://blog.csdn.net/maomaoqiukqq/article/details/104136915