Question
I have a list of about 200 objects, of which only about 20 are unique. I want to keep only the unique values. Between list.stream().collect(Collectors.toSet()) and list.stream().distinct().collect(Collectors.toList()), which is more efficient with respect to latency and memory consumption?
Answer 1:
While the answer is pretty obvious (don't bother with speed and memory consumption for so few elements, and note that one returns a Set and the other a List), there are some interesting small details (interesting IMO).
Suppose you are streaming from a source that is already known to be distinct. In that case the .distinct() operation is a no-op, because there is nothing left to do.
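Whether a source is "known to be distinct" is visible through the DISTINCT characteristic of its spliterator. The sketch below only inspects that flag for two common collections; it does not show what .distinct() does internally, which is an implementation detail. The class and method names (`DistinctFlagDemo`, `reportsDistinct`) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.Spliterator;

class DistinctFlagDemo {
    // Returns true if the stream source advertises that it contains no duplicates.
    static boolean reportsDistinct(Collection<?> source) {
        return source.stream().spliterator().hasCharacteristics(Spliterator.DISTINCT);
    }

    public static void main(String[] args) {
        // A HashSet cannot hold duplicates, so its spliterator reports DISTINCT.
        System.out.println(reportsDistinct(new HashSet<>(Arrays.asList(1, 2, 3))));   // true
        // An ArrayList may hold duplicates, so it does not report DISTINCT.
        System.out.println(reportsDistinct(new ArrayList<>(Arrays.asList(1, 1, 2)))); // false
    }
}
```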
If you are streaming from a List (which is ordered by design) and there are no intermediate operations that change the ordering (unordered, for example), .distinct() is forced to preserve the order by using a LinkedHashSet internally, which is pretty expensive.
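The ordering guarantee described above can be observed from the outside: on an ordered stream, .distinct() keeps the first occurrence of each element in encounter order. A small sketch, with hypothetical names (`OrderDemo`, `distinctInOrder`, `distinctIgnoringOrder`) and made-up sample data:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class OrderDemo {
    // Ordered source: the first occurrence of each element is kept, in encounter order.
    static List<String> distinctInOrder(List<String> input) {
        return input.stream().distinct().collect(Collectors.toList());
    }

    // unordered() lifts the ordering constraint; the result order is then not guaranteed.
    static List<String> distinctIgnoringOrder(List<String> input) {
        return input.stream().unordered().distinct().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("b", "a", "b", "c", "a");
        System.out.println(distinctInOrder(input)); // [b, a, c]
        System.out.println(distinctIgnoringOrder(input).size()); // 3
    }
}
```

Calling unordered() only helps if the caller really does not care about the order of the result.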
If you are doing parallel processing, the list.stream().collect(Collectors.toSet()) version will merge multiple HashSets (in Java 9 this has been slightly improved compared to Java 8). .distinct(), on the other hand, will spin up a ConcurrentHashMap that keeps all the keys with a dummy Boolean.TRUE value. It also does something interesting to preserve a null that your stream might contain, and even this is handled differently internally in the two cases.
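Whatever the internal data structure, both variants keep a null element when run in parallel. The following sketch checks only that observable behaviour; the names (`ParallelNullDemo`, `parallelDistinct`, `parallelToSet`) and the sample data are assumptions made for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

class ParallelNullDemo {
    // Parallel distinct(): duplicates are dropped, a null element is kept once.
    static List<String> parallelDistinct(List<String> input) {
        return input.parallelStream().distinct().collect(Collectors.toList());
    }

    // Parallel toSet(): partial sets are built per thread and then merged.
    static Set<String> parallelToSet(List<String> input) {
        return input.parallelStream().collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        // Arrays.asList is used because List.of rejects null elements.
        List<String> input = Arrays.asList("a", null, "b", "a", null);
        System.out.println(parallelDistinct(input).size());      // 3
        System.out.println(parallelToSet(input).contains(null)); // true
    }
}
```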
Answer 2:
A Set (typically a HashSet) consumes more memory than a List (typically an ArrayList), mainly because of the hash table it stores. But with so few elements, you will not see a noticeable difference in memory consumption.
What you should care about instead is that these collectors return different things: a List and a Set, each with its own characteristics, particularly in how you access their elements.
So choose the one that matches what you want to do with the collection.
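To make the difference in access concrete, here is a rough sketch: the Set result suits membership checks, while the List result suits positional access. The class `AccessDemo`, its methods, and the sample values are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

class AccessDemo {
    static Set<String> uniqueAsSet(List<String> input) {
        return input.stream().collect(Collectors.toSet());
    }

    static List<String> uniqueAsList(List<String> input) {
        return input.stream().distinct().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("b", "a", "b", "c", "a");

        // Set: natural fit for "have I seen this value?" questions.
        System.out.println(uniqueAsSet(input).contains("c")); // true

        // List: natural fit for "give me the n-th unique value" questions.
        System.out.println(uniqueAsList(input).get(0));       // b
    }
}
```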
Source: https://stackoverflow.com/questions/48994190/stream-collectcollectors-toset-vs-stream-distinct-collectcollectors-t