Example of the Scala aggregate function

傲寒 2020-12-22 19:07

I have been looking and I cannot find an example or discussion of the aggregate function in Scala that I can understand. It seems pretty powerful.

Can t

7 answers
  •  眼角桃花
    2020-12-22 19:26

    The signature for a collection with elements of type A is:

    def aggregate [B] (z: B)(seqop: (B, A) ⇒ B, combop: (B, B) ⇒ B): B 
    
    • z is an object of type B acting as a neutral element. If you want to count something, you can use 0; if you want to build a list, start with an empty list, etc.
    • seqop is analogous to the function you pass to the fold methods. It takes two arguments: the first has the same type as the neutral element you passed and represents what has already been aggregated in previous iterations; the second is the next element of your collection. The result must also be of type B.
    • combop is a function that combines two partial results into one.
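    To make the three arguments concrete, here is a minimal sketch that counts the characters in a list of words. The object name AggregateArgs and the sample words are made up for illustration. It only uses foldLeft and a direct call to combop, so it does not depend on aggregate itself.

```scala
object AggregateArgs {
  // Hypothetical sample data, not taken from the question.
  val words: List[String] = List("scala", "aggregate", "fold")

  // z: the neutral element of the result type B (here Int).
  val z: Int = 0

  // seqop: folds the next element (a String) into a running count (an Int).
  val seqop: (Int, String) => Int = (acc, w) => acc + w.length

  // combop: merges two partial counts into one.
  val combop: (Int, Int) => Int = _ + _

  // One pass over the whole list, using seqop only.
  val total: Int = words.foldLeft(z)(seqop)

  // Two partial results, each built with seqop, then merged with combop.
  val (left, right) = words.splitAt(1)
  val combined: Int = combop(left.foldLeft(z)(seqop), right.foldLeft(z)(seqop))
}
```

    Both total and combined evaluate to 18 here, because 0 is neutral for addition and addition does not care how the list was split.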

    In most collections, aggregate is implemented in TraversableOnce as:

      def aggregate[B](z: B)(seqop: (B, A) => B, combop: (B, B) => B): B 
        = foldLeft(z)(seqop)
    

    Thus combop is ignored. However, it makes sense for parallel collections, because seqop will first be applied locally in parallel, and then combop is called to finish the aggregation.
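    The split-then-combine behaviour can be imitated without any parallel collection. The sketch below is only an approximation of what a parallel aggregate does: chunkedAggregate is a hypothetical helper (not a library method), and it processes the chunks one after another rather than on several threads. It also shows why z should be neutral, since z is used once per chunk.

```scala
object ChunkedAggregate {
  // Hypothetical helper, not part of the Scala library: splits xs into
  // chunks, folds each chunk with seqop starting from z, then merges the
  // per-chunk results with combop.
  def chunkedAggregate[A, B](xs: Seq[A], chunkSize: Int)(z: B)(
      seqop: (B, A) => B,
      combop: (B, B) => B
  ): B =
    if (xs.isEmpty) z
    else xs.grouped(chunkSize).map(_.foldLeft(z)(seqop)).reduce(combop)

  val numbers: Seq[Int] = 1 to 10

  // With a neutral z the chunk size does not change the result.
  val sumTwoChunks: Int = chunkedAggregate(numbers, 5)(0)(_ + _, _ + _)
  val sumFiveChunks: Int = chunkedAggregate(numbers, 2)(0)(_ + _, _ + _)

  // With a non-neutral z, every chunk adds its own copy of z.
  val skewed: Int = chunkedAggregate(numbers, 2)(1)(_ + _, _ + _)
}
```

    With z = 0 both calls give 55. With z = 1 and five chunks the result is 60, which is why z should be a true neutral element for combop.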

    So for your example, you can try with a fold first:

    // Adds one (key, value) pair to the multimap accumulated so far.
    val seqOp =
      (map: Map[String, Set[String]], tuple: (String, String)) =>
        map + (tuple._1 -> (map.getOrElse(tuple._1, Set[String]()) + tuple._2))
    
    
    list.foldLeft( Map[String,Set[String]]() )( seqOp )
    // returns: Map(one -> Set(i, 1), two -> Set(2, ii), four -> Set(iv))
    

    Then you have to find a way of collapsing two multimaps:

    // Merges two multimaps: for every key found in either map, take the union of both value sets.
    val combOp = (map1: Map[String, Set[String]], map2: Map[String, Set[String]]) =>
      (map1.keySet ++ map2.keySet).foldLeft(Map[String, Set[String]]()) {
        (result, k) =>
          result + (k -> (map1.getOrElse(k, Set[String]()) ++ map2.getOrElse(k, Set[String]())))
      }
    

    Now, you can use aggregate in parallel:

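    // On Scala 2.13+, .par needs the scala-parallel-collections module and
    // import scala.collection.parallel.CollectionConverters._
    list.par.aggregate( Map[String,Set[String]]() )( seqOp, combOp )
    // returns: Map(one -> Set(i, 1), two -> Set(2, ii), four -> Set(iv))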
    

    Calling "par" on the list converts it to a parallel collection (scala.collection.parallel.immutable.ParSeq), which is what lets aggregate actually take advantage of multi-core processors. Without "par" there is no performance gain, because the aggregation is not done on a parallel collection.
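    For a version that can be run end to end without the parallel collections module, here is a sketch that applies the same two operations to explicit chunks. The input list is an assumption: the question's original list is not shown above, so this one was picked to be consistent with the output printed earlier. The names MultimapAggregate, MultiMap, sequential and chunked are also illustrative.

```scala
object MultimapAggregate {
  type MultiMap = Map[String, Set[String]]

  // Assumed input, chosen to match the result shown in the answer.
  val list: List[(String, String)] =
    List("one" -> "i", "two" -> "2", "two" -> "ii", "one" -> "1", "four" -> "iv")

  val empty: MultiMap = Map.empty

  // Adds one (key, value) pair to a multimap.
  val seqOp: (MultiMap, (String, String)) => MultiMap =
    (map, tuple) =>
      map + (tuple._1 -> (map.getOrElse(tuple._1, Set.empty[String]) + tuple._2))

  // Merges two multimaps by taking the union of the value sets per key.
  val combOp: (MultiMap, MultiMap) => MultiMap =
    (m1, m2) =>
      (m1.keySet ++ m2.keySet).foldLeft(Map.empty[String, Set[String]]) { (result, k) =>
        result + (k -> (m1.getOrElse(k, Set.empty[String]) ++ m2.getOrElse(k, Set.empty[String])))
      }

  // Sequential reference result: seqOp only.
  val sequential: MultiMap = list.foldLeft(empty)(seqOp)

  // Imitation of the parallel path: fold chunks of two pairs with seqOp,
  // then merge the partial multimaps with combOp (run here on one thread).
  val chunked: MultiMap =
    list.grouped(2).map(_.foldLeft(empty)(seqOp)).reduce(combOp)
}
```

    Both sequential and chunked contain the same three keys with the same value sets, which is the property a parallel aggregate relies on.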
