I can see from the Scala documentation that scala.collection.immutable.Set is only a trait. Which of the Set implementations is used by default? HashSet or TreeSet (or somet
By looking at the source code, you can find that sets of up to four elements have an optimized implementation provided by EmptySet, Set1, Set2, Set3 and Set4, which simply hold their elements directly as fields.
For example, here's the Set2 declaration (as of Scala 2.11.4):
class Set2[A] private[collection] (elem1: A, elem2: A) extends AbstractSet[A] with Set[A] with Serializable
And here's the contains implementation:
def contains(elem: A): Boolean =
  elem == elem1 || elem == elem2
or the find implementation:
override def find(f: A => Boolean): Option[A] = {
  if (f(elem1)) Some(elem1)
  else if (f(elem2)) Some(elem2)
  else None
}
Very straightforward.
For sets with more than four elements, the underlying implementation is a HashSet. We can easily verify this in the REPL:
scala> Set(1, 2, 3, 4).getClass
res1: Class[_ <: scala.collection.immutable.Set[Int]] = class scala.collection.immutable.Set$Set4
scala> Set(1, 2, 3, 4, 5, 6).getClass
res0: Class[_ <: scala.collection.immutable.Set[Int]] = class scala.collection.immutable.HashSet$HashTrieSet
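The same check can be run for every size around the threshold. The following is a minimal sketch, not taken from the library itself: the object name SetImplBySize and the helper implName are made up for illustration, and the exact class names printed depend on the Scala version (for instance, newer versions report a plain HashSet rather than HashSet$HashTrieSet).

```scala
object SetImplBySize {
  // Hypothetical helper: builds an immutable Set with n distinct Ints
  // and returns the name of its runtime class.
  def implName(n: Int): String = {
    val s: Set[Int] = (1 to n).toSet
    s.getClass.getName
  }

  def main(args: Array[String]): Unit = {
    // Sizes 0 to 4 should report the small specialized classes,
    // sizes 5 and up a hash-based implementation.
    (0 to 6).foreach { n =>
      println(s"size $n -> ${implName(n)}")
    }
  }
}
```

Running it should show the switch from Set4 to a hash-based class between sizes 4 and 5.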
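That being said, find may have to iterate over the whole HashSet, since hashing gives no way to locate an element by an arbitrary predicate, so it is O(n) in the worst case (it stops at the first match).
Conversely, a lookup operation like contains is effectively O(1), because it goes through the element's hash code.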
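To make the difference concrete, here is a small sketch that counts how often find invokes its predicate. The names FindVsContains, big and predicateCalls are assumptions introduced for this illustration only; it counts predicate calls rather than measuring time.

```scala
object FindVsContains {
  // A set large enough to be backed by a hash-based implementation.
  val big: Set[Int] = (1 to 1000).toSet

  // Hypothetical helper: returns how many times find evaluated the predicate p.
  def predicateCalls(s: Set[Int], p: Int => Boolean): Int = {
    var calls = 0
    s.find { x => calls += 1; p(x) }
    calls
  }

  def main(args: Array[String]): Unit = {
    // No element is negative, so find has to test every element.
    println(predicateCalls(big, _ < 0)) // prints 1000
    // contains uses the hash lookup instead of scanning the elements.
    println(big.contains(500))          // prints true
  }
}
```

When a matching element exists, the count depends on the set's internal iteration order, so only the no-match case gives a predictable number.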
Here's a more in-depth reference about performance of scala collections in general.
Speaking of Map, pretty much the same concepts apply. There are optimized Map implementations for up to four elements, and beyond that it's a HashMap.
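The Map case can be checked the same way. Again this is only a sketch: MapImplBySize and implName are illustrative names, and the printed class names vary between Scala versions.

```scala
object MapImplBySize {
  // Hypothetical helper: builds an immutable Map with n entries
  // and returns the name of its runtime class.
  def implName(n: Int): String = {
    val m: Map[Int, String] = (1 to n).map(i => i -> s"v$i").toMap
    m.getClass.getName
  }

  def main(args: Array[String]): Unit = {
    // Sizes 1 to 4 should report Map1..Map4, larger sizes a HashMap variant.
    (0 to 6).foreach { n =>
      println(s"size $n -> ${implName(n)}")
    }
  }
}
```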