Efficiently finding the intersection of a variable number of sets of strings

不羁的心 posted on 2019-11-26 20:46:15

Set.retainAll() is how you find the intersection of two sets. If you convert your ArrayLists to HashSets and call retainAll() in a loop over all of them, the whole operation is roughly O(n) in the total number of elements, assuming constant-time hash lookups.
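As a minimal sketch of the retainAll() loop described above: the class name RetainAllIntersection and the method name intersect are made up for illustration, and the input is assumed to be a list of collections of strings.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of any library.
final class RetainAllIntersection {

    // Returns the strings present in every input collection.
    // An empty input list yields an empty set.
    static Set<String> intersect(List<? extends Collection<String>> lists) {
        if (lists.isEmpty()) {
            return new HashSet<>();
        }
        // Copy the first collection so the caller's data is left untouched.
        Set<String> result = new HashSet<>(lists.get(0));
        // Stop early once the running intersection is empty.
        for (int i = 1; i < lists.size() && !result.isEmpty(); i++) {
            // Wrapping in a HashSet keeps each contains() check O(1) on average.
            result.retainAll(new HashSet<>(lists.get(i)));
        }
        return result;
    }
}
```

For example, intersecting [a, b, c], [b, c, d] and [c, b, e] this way leaves the set {b, c}.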

The accepted answer is just fine. As an update: since Java 8 there is a slightly more efficient way to find the intersection of two Sets.

Set<String> intersection = set1.stream()
    .filter(set2::contains)
    .collect(Collectors.toSet());

It is slightly more efficient because the original approach has to add every element of set1 to the result and then remove the ones that are not in set2. This approach adds to the result set only what needs to be in there.

Strictly speaking you could do this pre Java 8 as well, but without Streams the code would have been quite a bit more laborious.

If the two sets differ considerably in size, prefer streaming over the smaller one, since that means fewer contains() lookups.
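A possible way to combine the stream approach with the "smaller set first" advice is sketched below. StreamIntersection and intersect are illustrative names, not an existing API.

```java
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical helper, not part of any library.
final class StreamIntersection {

    // Streams over the smaller set and probes the larger one,
    // so the number of contains() calls equals the smaller size.
    static Set<String> intersect(Set<String> a, Set<String> b) {
        Set<String> smaller = a.size() <= b.size() ? a : b;
        Set<String> larger = (smaller == a) ? b : a;
        return smaller.stream()
                .filter(larger::contains)
                .collect(Collectors.toSet());
    }
}
```

This only pays off when contains() on the larger set is cheap, as it is for a HashSet.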

There is also the static method Sets.intersection(set1, set2) in Google Guava that returns an unmodifiable view of the intersection of two sets.

One more idea: if your arrays/sets are different sizes, it makes sense to begin with the smallest, because the intersection can never be larger than it.
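One way to apply the "begin with the smallest" idea to any number of sets is to order them by size before intersecting. The sketch below assumes the inputs are already Sets; SmallestFirstIntersection is a made-up name.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of any library.
final class SmallestFirstIntersection {

    static Set<String> intersect(Collection<? extends Set<String>> sets) {
        // Work on a copy of the list so the caller's ordering is preserved.
        List<Set<String>> bySize = new ArrayList<>(sets);
        bySize.sort(Comparator.comparingInt(Set::size));
        if (bySize.isEmpty()) {
            return new HashSet<>();
        }
        // Starting from the smallest set keeps the working set as small as possible.
        Set<String> result = new HashSet<>(bySize.get(0));
        for (int i = 1; i < bySize.size() && !result.isEmpty(); i++) {
            result.retainAll(bySize.get(i));
        }
        return result;
    }
}
```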

The best option would be to use HashSet to store the contents of these lists instead of ArrayList. If you can do that, you can create a temporary HashSet to which you add the elements to be intersected (use the addAll(..) method). Then call tempSet.retainAll(storedSet), and tempSet will contain the intersection.

Sort them (O(n lg n)) and then do a binary search (O(lg n)) for each candidate element.
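The sort-and-search idea could look roughly like this for two lists. It is only a sketch: SortedIntersection is a hypothetical name, strings are compared by their natural ordering, and the inputs are assumed to contain no null elements.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of any library.
final class SortedIntersection {

    // Returns the common elements in sorted order, without duplicates.
    static List<String> intersect(List<String> a, List<String> b) {
        // Sort copies so the callers' lists are not reordered.
        List<String> sortedA = new ArrayList<>(a);
        List<String> sortedB = new ArrayList<>(b);
        Collections.sort(sortedA);
        Collections.sort(sortedB);

        List<String> result = new ArrayList<>();
        String previous = null;
        for (String s : sortedA) {
            // Equal elements are adjacent after sorting, so this skips duplicates.
            if (s.equals(previous)) {
                continue;
            }
            previous = s;
            // binarySearch returns a non-negative index only when s is found.
            if (Collections.binarySearch(sortedB, s) >= 0) {
                result.add(s);
            }
        }
        return result;
    }
}
```

With [d, a, c, a] and [c, b, a] as input, the result is [a, c].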

You can use a single HashSet. Its add() method returns false when the object is already in the set. Adding the objects from all the lists and counting the false return values for each object gives you the union in the set plus the data for a histogram (the objects whose count + 1 equals the number of lists are your intersection). This assumes no list contains the same object twice. If you put the counts into a TreeSet, you can detect an empty intersection early.
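A rough sketch of the counting idea follows. The names CountingIntersection, union and repeats are invented here, the HashMap used to hold the per-element counts is one possible choice among several, and the inputs are taken as Sets so that no input contains an element twice. The TreeSet refinement is left out.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of any library.
final class CountingIntersection {

    static Set<String> intersect(List<? extends Set<String>> inputs) {
        Set<String> union = new HashSet<>();
        // How many times add() returned false for each element.
        Map<String, Integer> repeats = new HashMap<>();
        for (Set<String> input : inputs) {
            for (String s : input) {
                if (!union.add(s)) {
                    repeats.merge(s, 1, Integer::sum);
                }
            }
        }
        // An element seen in every input was rejected (inputs.size() - 1) times.
        Set<String> result = new HashSet<>();
        for (String s : union) {
            if (repeats.getOrDefault(s, 0) + 1 == inputs.size()) {
                result.add(s);
            }
        }
        return result;
    }
}
```

As a side effect, union ends up holding the union of all inputs and repeats holds the histogram data mentioned above.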
