Computing set intersection in linear time?

亡梦爱人 提交于 2019-11-26 09:02:13

问题


Is there an algorithm that, given two sets, computes their intersection in linear time?

I can run two for loops to check all pairs of elements, recording elements that I find in both of the sets. However, the runninng time will be O(n2). How do I do this in O(n) time?


回答1:


That depends on your set implementation.

If you have a hash set (O(1) lookup), then the approach indicated by all the other posters is correct. Iterate across all the elements in the first set. If it's in the second set, then add it to the result. This runs in O(n) time.

If you have a tree set (O(lg n) lookup), then this approach will work, but it runs in O(n lg n) time. You can do better; there's an O(n) solution. I assume that you have some sort of iterator that can traverse the elements of the two sets in ascending order. If you do, then the question is "given two lists in sorted order, find their intersection." This can be done using a modified version of the algorithm you use to merge two ranges. The idea is to keep track of the two iterators. At each step, compare the first elements of the ranges. If they're equal, add the element to the intersection and advance both iterators forward. If the first is less than the second, then advance the first iterator. If the first element is greater, then advance the second iterator. This runs in time O(n) because each iteration consumes at least one element, and there's only O(n) elements in total.




回答2:


I wonder nobody mentioned hashtable.
Regardless of your set implementation (even if 'set' here means a simple array), you can

  1. put contents of the first set into hashtable and
  2. iterate over second set, checking if hashtable contains current element.

O(n)




回答3:


intersection(a, b):
  result = new empty set
  for x in b:
    if a contains x:
      add x to result

  return result

If the contains test is constant time (such as in a set that uses a hash table as an implementation), then this algorithm is O(n).




回答4:


Combine the two arrays and count the no of occurrences of each element in this combined array and put these in a new array. Then check this count array for entries which contain 2, those elements are in intersection of the two sets.




回答5:


For all elements in set 1: Check if that element is in set 2. You can implement a Set that has amortized O(1) lookup time.




回答6:


if one of the two lists is ordered, then we can start with the unordered list

FUNCTION: INTERSECTION ( LIST A, LIST B )
{
   CREATE C AS EMPTY LIST

   FOR EVERY: NUMBER n IN A
   {
        IF BINARY-SEARCH(n) IN B
        {
            ADD n TO C
        }
   }

   RETURN C
}

Time Complexity = O(n O(BINARY-SEARCH)) = O(n log n)

if list B is hashed, then we've BIG-THETA(C n + T(hash))

where BIG-THETA is the asymptotic average, and C is a constant and T(hash) is time needed for the hash function



来源:https://stackoverflow.com/questions/4642172/computing-set-intersection-in-linear-time

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!