scala.collection.breakOut vs views

Question

This SO answer describes how scala.collection.breakOut can be used to prevent creating wasteful intermediate collections. For example, here we create an intermediate Seq[(String,String)]:

val m = List("A", "B", "C").map(x => x -> x).toMap

By using breakOut we can prevent the creation of this intermediate Seq:

val m: Map[String,String] = List("A", "B", "C").map(x => x -> x)(breakOut)

Views solve the same problem and in addition access elements lazily:

val m = (List("A", "B", "C").view map (x => x -> x)).toMap

I am assuming the creation of the View wrappers is fairly cheap, so my question is: Is there any real reason to use breakOut over Views?

Answer 1:

You're going to make a trip from England to France.

With view: you take a set of notes in your notebook and boom, once you've called .force you start carrying them all out: buy a ticket, board the plane, ....

With breakOut: you depart and boom, you're in Paris looking at the Eiffel Tower. You don't remember exactly how you got there, but you did actually make the trip; you just didn't make any memories.

Bad analogy, but I hope this gives you a taste of the difference between them.

Answer 2:

I don't think views and breakOut are identical.

A breakOut is a CanBuildFrom implementation used to simplify transformation operations by eliminating intermediary steps, e.g. getting from A to B without the intermediary collection. Using breakOut means letting Scala choose the appropriate builder object for maximum efficiency in producing new items in a given scenario. More details here.

Views deal with a different type of efficiency, the main sales pitch being: "No more new objects". Views store light references to objects to tackle different usage scenarios: lazy access etc.
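To make the "lazy access" point concrete, here is a minimal sketch that counts how often the mapping function runs. The object name, the pair helper and the counter are my own illustration, not anything from the library. It assumes the Scala 2.13+ collections, though I'd expect the same calls to behave the same way on 2.12.

```scala
object ViewToMapSketch {
  // Counts how often the mapping function actually runs (illustration only).
  var calls = 0

  // Hypothetical helper: same mapping as x => x -> x, plus the counter.
  def pair(x: String): (String, String) = { calls += 1; x -> x }

  // Mapping over the view only records the function; nothing is evaluated yet.
  val lazyPairs = List("A", "B", "C").view.map(pair)
  val callsBeforeToMap = calls

  // toMap walks the view once and puts each pair into the Map,
  // without first building a List of pairs.
  val m: Map[String, String] = lazyPairs.toMap
  val callsAfterToMap = calls

  def main(args: Array[String]): Unit = {
    println(s"calls before toMap: $callsBeforeToMap")
    println(s"calls after toMap: $callsAfterToMap")
    println(m)
  }
}
```

The function runs once per element, and only when toMap pulls the elements through.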

Bottom line:

If you map on a view you may still get an intermediary collection of references created before the expected result can be produced. You could still have superior performance from:

collection.view.map(someFn)(breakOut)

than from:

collection.view.map(someFn)

Answer 3:

What flavian said.

One use case for views is to conserve memory. For example, if you had a million-character-long string original, and needed to use, one by one, all of the million suffixes of that string, you might use a collection of

val v = original.view
val suffixes = v.tails

views on the original string. Then you might loop over the suffixes one by one, using suffix.force (no parentheses) to convert them back to strings within the loop, thus only holding one in memory at a time. Of course, you could do the same thing by iterating with your own loop over the indices of the original string, rather than creating any kind of collection of the suffixes.
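A rough sketch of that suffix loop, assuming Scala 2.13 or later. The object and the forEachSuffix helper are hypothetical names of mine. I use mkString instead of force to turn each suffix view back into a String, because on 2.13 force on a view returns an IndexedSeq rather than a String; that substitution is mine, not part of the original suggestion.

```scala
object SuffixViewSketch {
  // Hypothetical helper: hands each suffix of `original` to `use`, one at a time.
  def forEachSuffix(original: String)(use: String => Unit): Unit =
    original.view.tails.foreach { suffix =>
      // `suffix` is a view over the original characters; mkString copies
      // them into a String only for the duration of this iteration.
      use(suffix.mkString)
    }

  def main(args: Array[String]): Unit =
    // "abc" stands in for the million-character string.
    forEachSuffix("abc")(s => println("[" + s + "]"))
}
```

Note that tails also yields the empty suffix as its last element.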

Another use-case is when creation of the derived objects is expensive, you need them in a collection (say, as values in a map), but you only will access a few, and you don't know which ones.
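A small sketch of that second use case, again assuming Scala 2.13+. The expensive function and the counter are made up for illustration, and I use a Vector rather than a map to keep it short. One thing it shows besides the laziness: a view does not cache, so each access recomputes the element.

```scala
object SelectiveAccessSketch {
  // Counts how many derived objects were actually built (illustration only).
  var computed = 0

  // Hypothetical stand-in for an expensive derivation.
  def expensive(n: Int): String = { computed += 1; "value-" + n }

  // Nothing is computed here; the view only remembers the function.
  val derived = Vector(1, 2, 3, 4, 5).view.map(expensive)

  // Accessing one element computes just that element.
  val third: String = derived(2)
  val computedAfterOneAccess = computed

  // Accessing it again recomputes it, because views do not cache results.
  val thirdAgain: String = derived(2)
  val computedAfterTwoAccesses = computed

  def main(args: Array[String]): Unit = {
    println(third)
    println(s"computed after one access: $computedAfterOneAccess")
    println(s"computed after two accesses: $computedAfterTwoAccesses")
  }
}
```

If the same few elements are read many times, the recomputation can cost more than building them once, so this only pays off when accesses are sparse.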

If you really have a case where picking between them makes sense, prefer breakOut unless there's a good argument for using view (like those above).

Views require more code changes and care than breakOut, in that you need to add force where needed. Depending on context, failure to do so is often only detected at run-time. With breakOut, generally if it compiles, it's right.
In cases where view does not apply, breakOut will be faster, since view generation and forcing are skipped.
If you use a debugger, you can inspect the collection contents, which you can't meaningfully do with a collection of views.

Answer 4:

As of Scala 2.13, this is no longer a concern. breakOut has been removed and views are the recommended replacement.

Scala 2.13 Collections Rework

Views are also the recommended replacement for collection.breakOut. For example,

val s: Seq[Int] = ... 
val set: Set[String] = s.map(_.toString)(collection.breakOut)

can be expressed with the same performance characteristics as:

val s: Seq[Int] = ... 
val set = s.view.map(_.toString).to(Set)
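For anyone who wants to try the view-based form, here is a self-contained version. It only compiles on Scala 2.13 or later, since to(Set) does not exist in this form on 2.12. The sample values in s are my own placeholder for the elided sequence above.

```scala
object ViewToSetSketch {
  // Placeholder input; the quoted snippet leaves the actual sequence unspecified.
  val s: Seq[Int] = Seq(1, 2, 2, 3)

  // The view defers the map; to(Set) then builds the Set directly from it,
  // so no intermediate Seq[String] is created.
  val set: Set[String] = s.view.map(_.toString).to(Set)

  def main(args: Array[String]): Unit =
    println(set)
}
```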

Source: https://stackoverflow.com/questions/21285620/scala-collection-breakout-vs-views

Tags

scala

scala-collections