An algorithm to find the difference of two sets A and B of size n

流过昼夜 submitted on 2020-08-08 18:20:26

Question


There are two sets A and B, each of size n. How can I find every element of A that is not in B (A-B) in O(n) time? What data structure should I use (a Bloom filter?)


Answer 1:


Given that both are sets, you should use a set / hashset. This lets you perform the contains / in operation in O(1) expected time. Bloom filters aren't a good fit for this type of problem: they can tell you that an element definitely isn't in a set, but a positive answer may be a false positive. Since you want an exact answer, you're better off with a regular hashset.
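To make the false-positive issue concrete, here is a toy sketch, not a production Bloom filter. The class names, the two hash functions, and the helper approximateDifference are all invented for illustration. It assumes A-B is computed by keeping only the elements of A that the filter built from B reports as definitely absent.

```java
import java.util.BitSet;
import java.util.HashSet;
import java.util.Set;

final class BloomDifferenceDemo {

    // Toy Bloom filter with two ad-hoc hash functions (illustrative only).
    static final class ToyBloomFilter {
        private final BitSet bits;
        private final int size;

        ToyBloomFilter(int size) {
            this.size = size;
            this.bits = new BitSet(size);
        }

        private int h1(Object o) { return Math.floorMod(o.hashCode(), size); }
        private int h2(Object o) { return Math.floorMod(o.hashCode() * 31 + 17, size); }

        void add(Object o) {
            bits.set(h1(o));
            bits.set(h2(o));
        }

        // false: definitely absent. true: present OR a false positive.
        boolean mightContain(Object o) {
            return bits.get(h1(o)) && bits.get(h2(o));
        }
    }

    // Keeps the elements of a that the filter built from b reports as absent.
    // A false positive silently drops an element that belongs in A - B.
    static <T> Set<T> approximateDifference(Set<T> a, Set<T> b, int filterBits) {
        ToyBloomFilter filter = new ToyBloomFilter(filterBits);
        for (T element : b) {
            filter.add(element);
        }
        Set<T> result = new HashSet<>();
        for (T element : a) {
            if (!filter.mightContain(element)) {
                result.add(element);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Set<Integer> a = new HashSet<>();
        Set<Integer> b = new HashSet<>();
        for (int i = 0; i < 100; i++) {
            a.add(i);
            if (i % 2 == 0) {
                b.add(i);
            }
        }
        Set<Integer> exact = new HashSet<>(a);
        exact.removeAll(b);
        // Deliberately undersized filter (8 bits) to provoke false positives.
        Set<Integer> approx = approximateDifference(a, b, 8);
        System.out.println("exact size:  " + exact.size());
        System.out.println("approx size: " + approx.size());
    }
}
```

The approximate result never contains an element of B, but it can be missing elements that really are in A-B, which is why an exact hash set is the safer choice here.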

Given two hash sets, you can compute the set difference in O(min(|A|, |B|)) expected time, provided you are allowed to modify A in place.

If A is the smaller set, loop through all elements of A and discard the ones that are present in B.

If B is the smaller set, loop through all elements of B and remove from A each one that A contains.
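The two cases above can be sketched in Java as follows. This is a minimal illustration that assumes A may be modified in place; the class and method names are made up for the example.

```java
import java.util.HashSet;
import java.util.Set;

final class InPlaceDifference {

    // Removes from a every element that is also in b, scanning the smaller set.
    static <T> void subtractInPlace(Set<T> a, Set<T> b) {
        if (a.size() <= b.size()) {
            // A is smaller: scan A and drop whatever B contains.
            a.removeIf(b::contains);
        } else {
            // B is smaller: scan B and remove each element from A
            // (remove is a no-op when the element is absent).
            for (T element : b) {
                a.remove(element);
            }
        }
    }

    public static void main(String[] args) {
        Set<String> a = new HashSet<>(Set.of("X", "Y", "Z"));
        Set<String> b = new HashSet<>(Set.of("X", "Y"));
        subtractInPlace(a, b);
        System.out.println(a); // prints [Z]
    }
}
```

Either branch touches only min(|A|, |B|) elements, each with one expected O(1) hash lookup or removal. Set.of requires Java 9 or later.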




Answer 2:


Here is one way to calculate the set difference in O(n) time complexity (and O(n) space complexity) without using any fancy data structure other than a set. I assume that sets A and B can test for membership in O(1) expected time (as is typical of most HashSet implementations). This algorithm also does not require sets A and B to be modified.

Algorithm Pseudocode

Goal: Calculate (A-B)
Input: Set A, Set B;
BEGIN:
    Create Empty Set C to contain (A-B).
    for each element a in Set A:
        if a does not exist in Set B:
            Add a to Set C;
    Return Set C;
END;

Time Complexity:

This runs in O(n) time complexity because you only have to iterate through all n elements of Set A once. And for each of the n elements, you test Set B for membership in O(1) time. That yields an O(n) runtime for the algorithm.

Space Complexity:

The space complexity is O(n) because a new set C is used that will store up to all n elements in the solution.

Java Sample Implementation

import java.util.HashSet;

public class Tester {

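    // Returns a new set containing every element of A that is not in B.
    // Neither A nor B is modified.
    public static HashSet<String> setDifference(HashSet<String> A, HashSet<String> B) {
        HashSet<String> C = new HashSet<>();
        for (String element : A) {
            // contains() is an expected O(1) hash lookup
            if (!B.contains(element)) {
                C.add(element);
            }
        }
        return C;
    }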

    public static void main (String[] args) {
        HashSet<String> A = new HashSet<String>();
        HashSet<String> B = new HashSet<String>();

        A.add("X");
        A.add("Y");
        A.add("Z");
        B.add("X");
        B.add("Y");

        HashSet<String> C = setDifference(A, B);
        // Set should only contain the element "Z"
        System.out.println(C);
    }

}
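As an alternative sketch, the same non-destructive difference can be written with the standard library by copying A and calling removeAll on the copy. The class and method names here are hypothetical, and the example assumes both inputs are hash-based sets so that lookups stay O(1) on average.

```java
import java.util.HashSet;
import java.util.Set;

final class CopyDifference {

    // Returns a new set holding A - B; neither input is modified.
    static <T> Set<T> difference(Set<T> a, Set<T> b) {
        Set<T> result = new HashSet<>(a); // O(|A|) copy
        result.removeAll(b);              // drops every element also in b
        return result;
    }

    public static void main(String[] args) {
        Set<String> a = Set.of("X", "Y", "Z");
        Set<String> b = Set.of("X", "Y");
        System.out.println(difference(a, b)); // prints [Z]
    }
}
```

This trades the explicit loop for a copy followed by removals. The time and space bounds remain O(n) for two sets of size n.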


Source: https://stackoverflow.com/questions/54641552/an-algorithm-to-find-the-difference-of-two-set-a-and-b-with-size-n
