stream().collect(Collectors.toSet()) vs stream().distinct().collect(Collectors.toList())

If i have a list (~200 elements) of objects, with only few unique objects (~20 elements). I want to have only unique values. Between list.stream().collect(Collectors.toSet()) and list.stream().distinct().collect(Collectors.toList()) which is more efficient wrt latency and memory consumption ?

Eugene

While the answer is pretty obvious - don't bother with these details of speed and memory consumption for this little amount of elements and the fact that one returns a Set and the other a List; there are some interesting small details (interesting IMO).

Suppose you are streaming from a source that is already known to be distinct, in such a case your .distinct() operation will be a NO-OP; because there is no need to actually do anything.

If you are streaming from a List (which is by design ordered) and there are no intermediate operations (unordered for example) that change the order, .distinct() will be forced to preserve the order, by using a LinkedHashSet internally - pretty expensive.

If you are doing parallel processing, list.stream().collect(Collectors.toSet()) version will merge multiple HashSets (in 9 this has been slightly improved vs 8), .distinct() on the other hand, will spin a ConcurrentHashMap that will keep all the keys with a dummy Boolean.TRUE value (it's also doing something interesting to preserve the null that your stream might have - even this internally is handled differently in two cases)

A Set (typically HashSet) consumes more than a List (typically ArrayList), mainly because of the hashing table that it stores. But with so few elements, you will not get a noticeable difference in terms of memory consumption.
Instead, which you should care about is that these collectors return different things : a List and a Set that have their own specificities, particularly as as you access to their elements.
So use the way that matches to what you want to perform with this collection.

来源：https://stackoverflow.com/questions/48994190/stream-collectcollectors-toset-vs-stream-distinct-collectcollectors-t

标签

java-8

java-stream