Simplest way to get the top n elements of a Scala Iterable

佛祖请我去吃肉 2020-11-29 02:48

Is there a simple and efficient way to get the top n elements of a Scala Iterable? I mean something like:

iter.toList.sortBy(_.myAttr).take(2)
         


        
9 Answers
  •  星月不相逢
    2020-11-29 03:17

    Here's another solution that is simple and has pretty good performance.

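    def pickTopN[T](k: Int, iterable: Iterable[T])(implicit ord: Ordering[T]): Seq[T] = {
      // Max-heap over all elements: dequeue() always returns the largest remaining one
      val q = collection.mutable.PriorityQueue[T](iterable.toSeq: _*)
      val end = math.min(k, q.size)
      // Dequeue at most k times, yielding the k largest in descending order
      (1 to end).map(_ => q.dequeue())
    }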
    

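    The running time is O(n + k log n), where k <= n, assuming the queue is built in linear time (heapify) and each dequeue costs O(log n). So it is roughly linear for small k and O(n log n) at worst, when k approaches n.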
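
    As a usage sketch for the function above: `Item` and the sample data below are hypothetical stand-ins for the question's element type and its `myAttr` field, and the function is repeated so the snippet is self-contained. The ordering is passed explicitly because a case class has no default `Ordering`.

    ```scala
    import scala.collection.mutable

    object TopNUsage {
      // Hypothetical element type, standing in for the question's `myAttr`
      final case class Item(name: String, myAttr: Int)

      // Same approach as above: put everything in a max-heap, dequeue k times
      def pickTopN[T](k: Int, iterable: Iterable[T])(implicit ord: Ordering[T]): Seq[T] = {
        val q = mutable.PriorityQueue[T](iterable.toSeq: _*)
        val end = math.min(k, q.size)
        (1 to end).map(_ => q.dequeue())
      }

      // Made-up sample data
      val items = List(Item("a", 3), Item("b", 9), Item("c", 1), Item("d", 7))

      // Rank by myAttr by supplying the ordering explicitly
      val top2: Seq[Item] = pickTopN(2, items)(Ordering.by[Item, Int](_.myAttr))
    }
    ```

    Here `top2` holds the items named "b" and "d", largest first. Note that this is the opposite end from the question's `sortBy(_.myAttr).take(2)`, which returns the two smallest; pass a reversed ordering if that is what you want.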

    The solution can also be optimized to use O(k) memory at the cost of O(n log k) time. The idea is to use a min-heap that holds only the top k items seen so far: each new item is added and the smallest is then evicted. Here's that version (the code calls the count n rather than k).

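    def pickTopN[A, B](n: Int, iterable: Iterable[A], f: A => B)(implicit ord: Ordering[B]): Seq[A] = {
      val seq = iterable.toSeq
      // Reversed ordering turns the queue into a min-heap on f; initialize with the first n
      val q = collection.mutable.PriorityQueue[A](seq.take(n): _*)(ord.on(f).reverse)

      // invariant: q holds the top n items scanned so far
      seq.drop(n).foreach(v => {
        q += v       // heap briefly holds n + 1 items
        q.dequeue()  // evict the smallest, back to n
      })

      // dequeueAll yields smallest first, so reverse to get largest first
      q.dequeueAll.reverse
    }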
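
    One caveat with the version above: `iterable.toSeq` may materialize the whole input for lazy sources, so the O(k) memory bound only holds for the heap itself. The following is a minimal sketch of a streaming variant, not part of the original answer. It assumes the input is consumed once as an `Iterator`, and the name `topNBy` and the curried signature are hypothetical choices made here so the key function's type can be inferred.

    ```scala
    import scala.collection.mutable

    object StreamingTopN {
      // Hypothetical helper: the n largest elements by key f, largest first
      def topNBy[A, B](n: Int, it: Iterator[A])(f: A => B)(implicit ord: Ordering[B]): Seq[A] =
        if (n <= 0) Nil
        else {
          // Reversed ordering makes this a min-heap on f: head is the smallest kept item
          val heap = mutable.PriorityQueue.empty[A](ord.on(f).reverse)
          it.foreach { v =>
            if (heap.size < n) heap.enqueue(v)
            else if (ord.gt(f(v), f(heap.head))) {
              heap.dequeue() // evict the current smallest
              heap.enqueue(v)
            }
          }
          // Dequeue smallest first and prepend, so the largest ends up at the front
          var result = List.empty[A]
          while (heap.nonEmpty) result = heap.dequeue() :: result
          result
        }
    }
    ```

    For example, `StreamingTopN.topNBy(2, Iterator("pear", "fig", "banana", "apple"))(_.length)` keeps at most two strings in the heap at any time. Comparing against `heap.head` before inserting also skips the enqueue/dequeue pair for items that cannot make the cut.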
    
