Is groupBy leaking in akka-stream?

Posted by ☆樱花仙子☆ on 2019-12-10 15:55:54

Question


I want to write an akka-stream flow that groups events from an infinite stream by session_uid and calculates the total traffic for each session (details in my previous question).

I planned to use the Source#groupBy function to group events by session_uid, but it seems that this function accumulates all group keys internally and provides no way to release them. This eventually causes a java.lang.OutOfMemoryError: Java heap space exception. Here is code that reproduces it:

import akka.actor.ActorSystem
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.{Flow, Sink, Source}

import scala.util.Random

object GroupByMemoryLeakApplication extends App {
  implicit val system = ActorSystem()
  import system.dispatcher

  implicit val materializer = ActorMaterializer()

  val bigString = Random.nextString(512 * 1024)

  // Simulates an infinite stream of events; each event carries a unique, large session id
  val eventsSource = Source(() => (1 to 1000000000).iterator)
    .map((i) => { (i, bigString + i) })

  // Flow that passes each event through groupBy, keyed by the session id
  val groupByFlow = Flow[(Int, String)]
    .groupBy(_._2)
    .map {
      case (sessionUid, sessionEvents) =>
        sessionEvents
          .map(e => { println(e._1); e })
          .runWith(Sink.head)
    }
    .mapAsync(4)(identity)

  eventsSource
    .via(groupByFlow)
    .runWith(Sink.ignore)
    .onComplete(_ => system.shutdown())
}

So, how can I release a grouping key (sessionUid) inside groupBy once the related stream of events (sessionEvents) has been fully processed?

Or does anybody know another way to group events by session_uid with akka-stream?
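
To make the desired behaviour concrete, here is a minimal sketch in plain Scala, without akka-stream, of grouping in which a key is released as soon as its session finishes. The `Event` type, its `last` end-of-session flag, and the `SessionTotals` object are all assumptions made up for illustration; the real events may signal the end of a session differently (for example by a timeout).

```scala
// Hypothetical event type: `last` marks the final event of a session (an assumption).
final case class Event(sessionUid: String, bytes: Long, last: Boolean)

object SessionTotals {
  // Updates the map of open sessions with one event.
  // When the event closes its session, the key is removed from the map
  // and the final total is returned alongside the new map.
  def step(open: Map[String, Long], e: Event): (Map[String, Long], Option[(String, Long)]) = {
    val total = open.getOrElse(e.sessionUid, 0L) + e.bytes
    if (e.last) (open - e.sessionUid, Some(e.sessionUid -> total))
    else (open.updated(e.sessionUid, total), None)
  }

  // Lazily emits one (sessionUid, total) pair per finished session.
  // Only sessions that are still open are held in memory.
  def totals(events: Iterator[Event]): Iterator[(String, Long)] = {
    var open = Map.empty[String, Long]
    events.flatMap { e =>
      val (next, finished) = step(open, e)
      open = next
      finished.iterator
    }
  }
}
```

With this shape, memory grows with the number of concurrently open sessions rather than with the number of sessions ever seen. The same `step` function could presumably be wrapped in a stateful stream stage, but how to do that in akka-stream is exactly the open part of this question.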

Source: https://stackoverflow.com/questions/33865423/is-groupby-leaking-in-akka-stream
